Abstract

Let $X \to Y \to Z$ be a discrete memoryless degraded broadcast channel (DBC) with marginal transition
probability matrices $T_{YX}$ and $T_{ZX}$. This paper explores encoding schemes for DBCs. We call a DBC encoder
a natural encoding (NE) scheme if it combines symbols from independent codebooks, each using the same
alphabet as $X$, with the same single-letter function that adds distortion to the channel. This paper
shows that NE schemes achieve the boundary of the capacity region for the multi-user broadcast Z channel,
the two-user group-additive DBC, and the two-user discrete multiplicative DBC. This paper also defines and
studies the input-symmetric DBC, introduces a permutation encoding approach for the input-symmetric
DBC, and proves its optimality.

Let $q$ denote the distribution of the channel input $X$. Inspired by Witsenhausen and Wyner, this paper
defines and studies the function $F^*$. For any given $q$ and $H(Y|X) \le s \le H(Y)$, where $H(Y|X)$ is the
conditional entropy of $Y$ given $X$ and $H(Y)$ is the entropy of $Y$, define the function
$F^*_{T_{YX},T_{ZX}}(q,s)$ as the infimum of $H(Z|U)$, the conditional entropy of $Z$ given $U$, with respect
to all discrete random variables $U$ such that a) $H(Y|U) = s$, and b) $U$ and $(Y, Z)$ are conditionally
independent given $X$. This paper studies the function $F^*$, its properties and its calculation. This paper
then represents the capacity region of the DBC $X \to Y \to Z$ using the function $F^*_{T_{YX},T_{ZX}}$.
Finally, this paper applies these results to several classes of DBCs and their encoders as discussed above.
I. Introduction

In the 1970s, Cover [1], Bergmans [2] and Gallager [3] established the capacity region for
degraded broadcast channels (DBCs). A common optimal transmission strategy to achieve
the boundary of the capacity region for degraded broadcast channels is the joint encoding
scheme presented in [4]. Specifically, the data sent to the user with the most degraded channel
is encoded first. Given the codeword selected for that user, an appropriate codebook for the
user with the second most degraded channel is selected, and so forth.
An independent-encoding scheme can also achieve the capacity of any DBC, as described
in Appendix I [5]. This scheme essentially embeds all symbols from all the needed codebooks
for the less-degraded user into a single super-symbol (but perhaps with a large alphabet).
Then a single-letter function uses the input symbol from the more-degraded user to extract
the needed symbol from the super-symbol provided by the less-degraded user.

Cover [6] introduced an independent-encoding scheme for two-user broadcast channels.
When applied to two-user degraded broadcast channels, this scheme independently encodes
the users' messages and then combines the resulting codewords by applying a single-letter
function. This scheme does not specify what codebooks to use or what single-letter function
to use. It is a general independent-encoding approach, which includes the
independent-encoding scheme in Appendix I.

A simple encoding scheme that is optimal for some common DBCs is an independent-encoding
approach in which symbols from independent codebooks, each with the same alphabet
as $X$, are combined using the same single-letter function that adds distortion to the
channel. We refer to this encoding scheme as the natural encoding (NE) scheme. As an
example, the NE scheme for a two-receiver broadcast Gaussian channel has as each transmitted
symbol the real addition of two real symbols from independent codebooks.

The NE scheme is known to achieve the boundary of the capacity region for several broadcast
channels including broadcast Gaussian channels [7], broadcast binary-symmetric channels
[2] [8] [9] [10], discrete additive degraded broadcast channels [11] and two-user broadcast
Z channels [12] [13]. This paper shows that NE schemes also achieve the boundary of the
capacity region for the multi-user broadcast Z channel, the two-user group-additive DBC,
and the two-user discrete multiplicative DBC.

Shannon's entropy power inequality (EPI) [14] gives a lower bound on the differential
entropy of the sum of independent random variables. In Bergmans's remarkable paper [7],
he applied EPI to establish a converse showing the optimality of the scheme given by [1] [2]
(the NE scheme) for broadcast Gaussian channels. "Mrs. Gerber's Lemma" [15] provides
a lower bound on the entropy of a sequence of binary-symmetric channel outputs. Wyner
and Ziv obtained "Mrs. Gerber's Lemma" and applied it to establish a converse showing
that the NE scheme for broadcast binary-symmetric channels suggested by Cover [1] and
Bergmans [2] achieves the boundary of the capacity region [8]. EPI and "Mrs. Gerber's
Lemma" play the same significant role in proving the optimality of the NE schemes for
broadcast Gaussian channels and broadcast binary-symmetric channels. Witsenhausen and
Wyner studied a conditional entropy bound for the channel output of a discrete channel
and applied the results to establish an outer bound of the capacity region for DBCs [9]
[10]. For broadcast binary-symmetric channels, this outer bound coincides with the capacity
region. This paper extends ideas from Witsenhausen and Wyner [10] to study a conditional
entropy bound for the channel output of a discrete DBC and represent the capacity region
of discrete DBCs with this conditional entropy bound. This paper simplifies the expression
of the conditional entropy bound for broadcast binary-symmetric channels and broadcast
Z channels. For broadcast Z channels, the simplified expression of the conditional entropy
bound demonstrates that the NE scheme provided in [12], which is optimal for two-user
broadcast Z channels, is also optimal for multi-user broadcast Z channels.

Witsenhausen and Wyner made two seminal contributions in [9] and [10]: the notion of
minimizing one entropy under the constraint that another related entropy is fixed, and the
use of input symmetry as a way of solving an entire class of channels with a single unifying
approach. Benzel [11] used the first idea to study discrete additive degraded broadcast
channels. Recently Liu and Ulukus [16] [17] used both ideas together to extend Benzel's
results to include the larger class of discrete degraded interference channels (DDIC). Our
paper defines what it means for a degraded broadcast channel to be input-symmetric (IS) and
provides an independent-encoding scheme which achieves the capacity region of all
input-symmetric DBCs.

The input-symmetric channel was introduced by Witsenhausen and Wyner [10] and studied
in [16] [17] and [18]. This paper extends the definition of the input-symmetric channel to
the definition of the input-symmetric DBC. This paper introduces an independent-encoding
scheme employing permutation functions of independently encoded streams (the permutation
encoding approach) for the input-symmetric DBC and proves its optimality. The discrete
additive DBC [11] is a special case of the input-symmetric DBC, and the optimal encoding
approach for the discrete additive DBC [11] is also a special case of the permutation encoding
approach. The group-additive DBC is a class of input-symmetric DBCs whose channel
outputs are group additions of the channel input and noise. The permutation encoding
approach for the group-additive DBC is the group-addition encoding approach, which is the
NE scheme for the group-additive DBC.

The discrete multiplicative DBC is a discrete DBC whose channel outputs are discrete 
multiplications (multiplications in a finite field) of the channel input and noise. This paper 
studies the conditional entropy bound for the discrete multiplicative DBC, and proves that 
the NE scheme achieves the boundary of the capacity region for discrete multiplicative DBCs. 

This paper is organized as follows: Section II defines the conditional entropy bound $F^*(\cdot)$
for the channel output of a discrete DBC and represents the capacity region of the discrete
DBC with the function $F^*$. Section III establishes a number of theorems concerning various
properties of $F^*$. Section IV evaluates $F^*(\cdot)$ and indicates the optimal transmission strategy
for the discrete DBC. Section V proves the optimality of the NE scheme for the multi-user
broadcast Z channel. Section VI introduces the input-symmetric DBC, provides the
permutation encoding approach, an independent-encoding scheme for the input-symmetric
DBC, and proves its optimality. Section VII studies the discrete multiplicative DBC and
shows that the NE scheme achieves the boundary of the capacity region for the discrete
multiplicative DBC. Section VIII delivers the conclusions.
II. The Conditional Entropy Bound

Let $X \to Y \to Z$ be a discrete memoryless DBC where $X \in \{1, 2, \dots, k\}$, $Y \in \{1, 2, \dots, n\}$
and $Z \in \{1, 2, \dots, m\}$. Let $T_{YX}$ be an $n \times k$ stochastic matrix with entries
$T_{YX}(j,i) = \Pr(Y = j \mid X = i)$ and $T_{ZX}$ be an $m \times k$ stochastic matrix with entries
$T_{ZX}(j,i) = \Pr(Z = j \mid X = i)$. Thus, $T_{YX}$ and $T_{ZX}$ are the marginal transition probability
matrices of the degraded broadcast channel.

Definition 1: Let the vector $q$ in the simplex $\Delta_k$ of probability $k$-vectors be the distribution
of the channel input $X$. Define the function $F^*_{T_{YX},T_{ZX}}(q,s)$ as the infimum of $H(Z|U)$, the
conditional entropy of $Z$ given $U$, with respect to all discrete random variables $U$ such that

• a) $H(Y|U) = s$;

• b) $U$ and $(Y, Z)$ are conditionally independent given $X$, i.e., the sequence $U, X, Y, Z$ forms a
Markov chain $U \to X \to Y \to Z$.

For any fixed vector $q$, the domain of $F^*_{T_{YX},T_{ZX}}(q,s)$ in $s$ is the closed interval
$[H(Y|X), H(Y)]$, where $H(Y|X)$ is the conditional entropy of $Y$ given $X$ and $H(Y)$ is the entropy of $Y$.
This will be proved later in Lemma 3. The function $F^*(\cdot)$ is an extension of the function $F(\cdot)$
introduced in [10]. We will use $F^*_{T_{YX},T_{ZX}}(q,s)$, $F^*(q,s)$ and $F^*(s)$ interchangeably.

Theorem 1: $F^*_{T_{YX},T_{ZX}}(q,s)$ is monotonically nondecreasing in $s$ and the infimum in its
definition is a minimum. Hence, $F^*_{T_{YX},T_{ZX}}(q,s)$ can be taken as the minimum of $H(Z|U)$ with
respect to all discrete random variables $U$ such that

• a) $H(Y|U) \ge s$;

• b) $U$ and $(Y, Z)$ are conditionally independent given $X$.

The proof of Theorem 1 will be given in Section III.

Theorem 2: The capacity region for the discrete memoryless degraded broadcast channel
$X \to Y \to Z$ is the closure of the convex hull of all rate pairs $(R_1, R_2)$ satisfying

$$0 \le R_1 \le I(X;Y), \tag{1}$$
$$R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(q, R_1 + H(Y|X)), \tag{2}$$

for some $q \in \Delta_k$, where $I(X;Y)$ is the mutual information between $X$ and $Y$, $H(Y|X)$
is the conditional entropy of $Y$ given $X$, and $H(Z)$ is the entropy of $Z$ resulting from the
channel input's distribution $q$. Thus, for a fixed input distribution $q$ and for $\lambda \ge 0$, finding
the maximum of $R_2 + \lambda R_1$ is equivalent to finding the minimum of $F^*(q,s) - \lambda s$ as follows:

$$\begin{aligned}
\max (R_2 + \lambda R_1) &= \max \{ H(Z) - F^*(q, R_1 + H(Y|X)) + \lambda R_1 + \lambda H(Y|X) - \lambda H(Y|X) \} \\
&= H(Z) - \lambda H(Y|X) + \max \{ -F^*(q, R_1 + H(Y|X)) + \lambda (R_1 + H(Y|X)) \} \\
&= H(Z) - \lambda H(Y|X) - \min_s \{ F^*(q,s) - \lambda s \}.
\end{aligned} \tag{3}$$



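Proof: The capacity region for the DBC is known from [1] [3] [4] to be

$$\overline{\mathrm{co}} \left[ \bigcup_{p(u),\, p(x|u)} \{ (R_1, R_2) : R_1 \le I(X;Y|U),\ R_2 \le I(U;Z) \} \right], \tag{4}$$

where $\overline{\mathrm{co}}$ denotes the closure of the convex hull operation, and $U$ is the auxiliary random
variable which satisfies the Markov chain $U \to X \to Y \to Z$ and
$|\mathcal{U}| \le \min(|\mathcal{X}|, |\mathcal{Y}|, |\mathcal{Z}|)$.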
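Rewriting (4), we have

$$\begin{aligned}
&\overline{\mathrm{co}} \left[ \bigcup_{p(u),\, p(x|u)} \{ (R_1, R_2) : R_1 \le I(X;Y|U),\ R_2 \le I(U;Z) \} \right] \\
&= \overline{\mathrm{co}} \left[ \bigcup_{p_X = q \in \Delta_k} \ \bigcup_{p(u,x) \text{ with } p_X = q} \{ (R_1, R_2) : R_1 \le I(X;Y|U),\ R_2 \le I(U;Z) \} \right] && (5) \\
&= \overline{\mathrm{co}} \left[ \bigcup_{p_X = q \in \Delta_k} \ \bigcup_{p(u,x) \text{ with } p_X = q} \{ (R_1, R_2) : R_1 \le H(Y|U) - H(Y|X),\ R_2 \le H(Z) - H(Z|U) \} \right] && (6) \\
&= \overline{\mathrm{co}} \left[ \bigcup_{p_X = q \in \Delta_k} \ \bigcup_{H(Y|X) \le s \le H(Y)} \{ (R_1, R_2) : R_1 \le s - H(Y|X),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(q,s) \} \right] && (7) \\
&= \overline{\mathrm{co}} \left[ \bigcup_{p_X = q \in \Delta_k} \{ (R_1, R_2) : 0 \le R_1 \le I(X;Y),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(q, R_1 + H(Y|X)) \} \right], && (8)
\end{aligned}$$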

where $p_X$ is the vector expression of the distribution of the channel input $X$. Some of these
steps are justified as follows:

• (5) follows from the equivalence of $\bigcup_{p(u),\,p(x|u)}$ and $\bigcup_{p_X = q \in \Delta_k} \bigcup_{p(u,x) \text{ with } p_X = q}$;

• (7) follows from the definition of the conditional entropy bound $F^*$;

• (8) follows from the nondecreasing property of $F^*(s)$ in Theorem 1, which allows the
substitution $s = R_1 + H(Y|X)$ in the argument of $F^*$. Q.E.D.

Note that for a fixed distribution $p_X = q$ of the channel input $X$, the items $I(X;Y)$,
$H(Z)$ and $H(Y|X)$ in (8) are constants. This theorem provides the relationship between
the capacity region and the conditional entropy bound $F^*$ for a discrete degraded broadcast
channel. It also motivates the further study of $F^*$.
III. Some Properties of $F^*(\cdot)$

In this section, we will extend ideas from [10] to establish several properties of the
conditional entropy bound $F^*(\cdot)$. In [10], Witsenhausen and Wyner defined a conditional entropy
bound $F(\cdot)$ for a pair of discrete random variables and provided some properties of $F(\cdot)$.
The definition of $F(\cdot)$ is restated here. Let $X \to Z$ be a discrete memoryless channel with
the $m \times k$ transition probability matrix $T$, where the entries $T(j,i) = \Pr(Z = j \mid X = i)$. Let
$q$ be the distribution of $X$. For any $q \in \Delta_k$ and $0 \le s \le H(X)$, the function $F_T(q,s)$ is the
infimum of $H(Z|U)$ with respect to all discrete random variables $U$ such that $H(X|U) = s$
and the sequence $U, X, Z$ is a Markov chain. By definition, $F_T(q,s) = F^*_{I,T}(q,s)$, where $I$ is
an identity matrix. Since $F^*(\cdot)$ is the extension of $F(\cdot)$, most of the properties of $F^*(\cdot)$ in
this section are generalizations of properties of $F(\cdot)$ in [10].

For any choice of the integer $l \ge 1$, $w = [w_1, \dots, w_l]^T \in \Delta_l$ and $p_j \in \Delta_k$ for $j = 1, \dots, l$,
let $U$ be an $l$-ary random variable with distribution $w$, and let $T_{XU} = [p_1 \cdots p_l]$ be the
transition probability matrix from $U$ to $X$. We can compute

$$p = p_X = T_{XU} w = \sum_{j=1}^{l} w_j p_j \tag{9}$$

$$\xi = H(Y|U) = \sum_{j=1}^{l} w_j h_n(T_{YX} p_j) \tag{10}$$

$$\eta = H(Z|U) = \sum_{j=1}^{l} w_j h_m(T_{ZX} p_j) \tag{11}$$

where $h_n : \Delta_n \to \mathbb{R}$ is the entropy function, i.e., $h_n(p_1, \dots, p_n) = -\sum_j p_j \ln p_j$. Thus the
choices of $U$ satisfying conditions a) and b) in the definition of $F^*_{T_{YX},T_{ZX}}(q,s)$ correspond
to the choices of $l$, $w$ and $p_j$ for which (9) and (10) yield $p = q$ and $\xi = s$.
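As a concrete reading of (9), (10) and (11), the following minimal Python sketch computes the point $(p, \xi, \eta)$ generated by a given choice of weights and conditional input distributions. The function names and the example matrices (binary-symmetric components with crossover probabilities 0.1 and 0.2) are illustrative assumptions, not part of the paper.

```python
import math

def entropy(p):
    """Entropy in nats of a probability vector, with 0 ln 0 taken as 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def apply_channel(T, p):
    """Output distribution T p for a column-stochastic matrix T given as a list of rows."""
    return [sum(row[i] * p[i] for i in range(len(p))) for row in T]

def mixture_point(w, ps, T_yx, T_zx):
    """Return (p, xi, eta) as in (9)-(11) for weights w and conditional input distributions ps."""
    k = len(ps[0])
    p = [sum(wj * pj[i] for wj, pj in zip(w, ps)) for i in range(k)]
    xi = sum(wj * entropy(apply_channel(T_yx, pj)) for wj, pj in zip(w, ps))
    eta = sum(wj * entropy(apply_channel(T_zx, pj)) for wj, pj in zip(w, ps))
    return p, xi, eta

# Hypothetical example: binary-symmetric components, and U = X with uniform input.
T_YX = [[0.9, 0.1], [0.1, 0.9]]
T_ZX = [[0.8, 0.2], [0.2, 0.8]]
P, XI, ETA = mixture_point([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], T_YX, T_ZX)
```

With $U = X$ the sketch returns $\xi = H(Y|X)$ and $\eta = H(Z|X)$, which is the left endpoint of the curve $F^*(q, \cdot)$ studied below.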

Let $S = \{ (p, h_n(T_{YX} p), h_m(T_{ZX} p)) \in \Delta_k \times [0, \ln n] \times [0, \ln m] \mid p \in \Delta_k \}$. Since $\Delta_k$ is
$(k-1)$-dimensional, $\Delta_k \times [0, \ln n] \times [0, \ln m]$ is a $(k+1)$-dimensional convex polytope. The mapping
$p \mapsto (p, h_n(T_{YX} p), h_m(T_{ZX} p))$ assigns a point in $S$ to each $p \in \Delta_k$. Because this mapping
is continuous and the domain of the mapping, $\Delta_k$, is compact and connected, the image $S$
is also compact and connected.

Let $C$ be the set of all $(p, \xi, \eta)$ satisfying (9), (10) and (11) for some choice of $l$, $w$ and $p_j$.
By definition, the set $C$ is the convex hull of the set $S$. Thus, $C$ is compact, connected, and
convex.

Lemma 1: $C$ is the convex hull of $S$, and thus $C$ is compact, connected, and convex.

Lemma 2: i) Every point of $C$ can be obtained by (9), (10) and (11) with $l \le k + 1$. In
other words, one need only consider random variables $U$ taking at most $k + 1$ values.
ii) Every extreme point of the intersection of $C$ with a two-dimensional plane can be obtained
with $l \le k$.

The proof of Lemma 2 is the same as the proof of a similar lemma for $F(\cdot)$ in [10]. The
details of the proof are given in Appendix II.

Let $C^* = \{ (\xi, \eta) \mid (q, \xi, \eta) \in C \text{ for some } q \in \Delta_k \}$ be the projection of the set $C$ onto the
$(\xi, \eta)$-plane. Let $C^*_q = \{ (\xi, \eta) \mid (q, \xi, \eta) \in C \}$ be the projection onto the $(\xi, \eta)$-plane of the
intersection of $C$ with the two-dimensional plane $p = q$. By definition, $C^* = \bigcup_{q \in \Delta_k} C^*_q$. Also, $C^*$
and $C^*_q$ are compact and convex. By definition, $F^*_{T_{YX},T_{ZX}}(q,s)$ is the infimum of all $\eta$ for which
$C^*_q$ contains the point $(s, \eta)$. Thus

$$F^*_{T_{YX},T_{ZX}}(q,s) = \inf \{ \eta \mid (q, s, \eta) \in C \} = \inf \{ \eta \mid (s, \eta) \in C^*_q \}. \tag{12}$$

Lemma 3: For any fixed $q$ as the distribution of $X$, the domain of $F^*_{T_{YX},T_{ZX}}(q,s)$ in $s$ is
the closed interval $[H(Y|X), H(Y)]$, i.e., $[\sum_{i=1}^{k} q_i h_n(T_{YX} e_i),\ h_n(T_{YX} q)]$, where $e_i$ is a vector
for which the $i$-th entry is 1 and all other entries are zeros.

Proof: For any Markov chain $U \to X \to Y$, by the Data Processing Theorem [19],
$H(Y|U) \ge H(Y|X)$ and the equality is achieved when the random variable $U = X$. One
also has $H(Y|U) \le H(Y)$ and the equality is achieved when $U$ is a constant. Thus, the
domain of $F^*_{T_{YX},T_{ZX}}(q,s)$ in $s$ is $[H(Y|X), H(Y)]$ for a fixed distribution of the channel input
$X$. Since $q$ is the distribution of $X$, $H(Y|X) = \sum_{i=1}^{k} q_i h_n(T_{YX} e_i)$ and $H(Y) = h_n(T_{YX} q)$.
Q.E.D.



Theorem 3: The function $F^*_{T_{YX},T_{ZX}}(q,s)$ is defined on the compact convex domain
$\{ (q,s) \mid q \in \Delta_k,\ \sum_{i=1}^{k} q_i h_n(T_{YX} e_i) \le s \le h_n(T_{YX} q) \}$ and for each $(q,s)$ in this domain,
the infimum in its definition is a minimum, attainable with $U$ taking at most $k + 1$ values.

Proof: By Lemma 3, the function $F^*$ is defined on the compact domain
$\{ (q,s) \mid q \in \Delta_k,\ \sum_{i=1}^{k} q_i h_n(T_{YX} e_i) \le s \le h_n(T_{YX} q) \}$. This domain is convex because $\Delta_k$ is
convex, the entropy function $h_n(T_{YX} q)$ is concave in $q$ and $\sum_{i=1}^{k} q_i h_n(T_{YX} e_i)$ is linear in $q$.
For each $(q,s)$ in this domain, the set $\{ \eta \mid (s, \eta) \in C^*_q \}$ is non-empty. It is in fact a compact
interval since $C^*_q$ is compact. Therefore,

$$F^*_{T_{YX},T_{ZX}}(q,s) = \inf \{ \eta \mid (s,\eta) \in C^*_q \} = \min \{ \eta \mid (s,\eta) \in C^*_q \} = \min \{ \eta \mid (q,s,\eta) \in C \}. \tag{13}$$

By Lemma 2 i), this minimum is attained with $U$ taking at most $k + 1$ values. Q.E.D.

By Lemma 2 ii), the extreme points of $C^*_q$ can be attained by convex combinations of at
most $k$ points of $S$. Thus, every linear function of $(\xi, \eta)$ attains its minimum with $U$
taking at most $k$ values, since every linear function of $(\xi, \eta)$ achieves its minimum over $C^*_q$ at
an extreme point of the compact set $C^*_q$.

Lemma 4: The function $F^*_{T_{YX},T_{ZX}}(q,s)$ is jointly convex in $(q,s)$.

Proof: $F^*_{T_{YX},T_{ZX}}(q,s)$ is jointly convex in $(q,s)$ because $C$ is a convex set. In particular,
the domain of $F^*$ is convex by Theorem 3. For any two points $(q_1, s_1)$ and $(q_2, s_2)$ in the
domain, and for any $0 \le \theta \le 1$,

$$\begin{aligned}
&F^*_{T_{YX},T_{ZX}}(\theta q_1 + (1-\theta) q_2,\ \theta s_1 + (1-\theta) s_2) \\
&= \min \{ \eta \mid (\theta q_1 + (1-\theta) q_2,\ \theta s_1 + (1-\theta) s_2,\ \eta) \in C \} \\
&\le \min \{ \theta \eta_1 + (1-\theta) \eta_2 \mid (q_1, s_1, \eta_1),\ (q_2, s_2, \eta_2) \in C \} \\
&= \theta F^*_{T_{YX},T_{ZX}}(q_1, s_1) + (1-\theta) F^*_{T_{YX},T_{ZX}}(q_2, s_2).
\end{aligned}$$

Therefore, $F^*_{T_{YX},T_{ZX}}(q,s)$ is jointly convex in $(q,s)$. Q.E.D.



Now we give the proof of Theorem 1. Since Theorem 3 has shown that the infimum
in the definition of $F^*$ is a minimum, it suffices to show that $F^*(s) = F^*_{T_{YX},T_{ZX}}(q,s)$ is
monotonically nondecreasing in $s$. For any fixed $q$, the domain of $s$ is $[H(Y|X), H(Y)]$. On
the one hand,

$$\begin{aligned}
F^*(q, H(Y|X)) &= \min \{ H(Z|U) \mid p_X = q,\ H(Y|U) = H(Y|X) \} \\
&\le \min \{ H(Z|U) \mid p_X = q,\ U = X \} \\
&= H(Z|X).
\end{aligned} \tag{14}$$

On the other hand, for any $s \in [H(Y|X), H(Y)]$,

$$\begin{aligned}
F^*(q,s) &= \min \{ H(Z|U) \mid p_X = q,\ H(Y|U) = s \} \\
&\ge \min \{ H(Z|U,X) \mid p_X = q,\ H(Y|U) = s \} && (15) \\
&= H(Z|X), && (16)
\end{aligned}$$

where (15) follows from $H(Z|U) \ge H(Z|U,X)$ and (16) follows from the conditional
independence between $Z$ and $U$ given $X$. Inequalities (14) and (16) imply that for any
$s \in [H(Y|X), H(Y)]$,

$$F^*(q,s) \ge F^*(q, H(Y|X)). \tag{17}$$

Combining (17) and the fact that $F^*(q,s)$ is convex in $s$ for any fixed $q$, we have that $F^*(q,s)$ is
monotonically nondecreasing in $s$. Q.E.D.

The proof of Theorem 1 also gives an endpoint of $F^*(s)$,

$$F^*(q, H(Y|X)) = H(Z|X), \tag{18}$$

which is achieved when $U = X$. The following theorem will provide the other endpoint,

$$F^*(q, H(Y)) = H(Z), \tag{19}$$

which is obtained when $U$ is a constant.

Theorem 4: For $H(Y|X) \le s \le H(Y)$, a lower bound of $F^*(s)$ is

$$F^*(s) \ge s + H(Z) - H(Y). \tag{20}$$

$F^*(s)$ is differentiable at all but at most countably many points. At differentiable points of
$F^*(s)$,

$$0 \le \frac{dF^*(s)}{ds} \le 1. \tag{21}$$

Proof:

$$\begin{aligned}
& I(U;Z) \le I(U;Y) && (22) \\
\Rightarrow\ & H(Z) - H(Z|U) \le H(Y) - H(Y|U) \\
\Rightarrow\ & H(Z|U) \ge H(Y|U) + H(Z) - H(Y) \\
\Rightarrow\ & F^*(s) \ge s + H(Z) - H(Y). && (23)
\end{aligned}$$
Fig. 1. Illustrations of the curve $F^*(q,s) = F^*_{T_{YX},T_{ZX}}(q,s)$ shown in bold, the region $C^*_q$, and the
point $(0, \psi(q, \lambda))$.



Some of these steps are justified as follows:

• (22) follows from the Data Processing Theorem [19];

• (23) follows from the definition of $F^*(s)$.

When the random variable $U$ is a constant, $H(Y|U) = H(Y)$ and $H(Z|U) = H(Z)$. Thus,
equality in (23) is attained when $s = H(Y)$. Since $F^*(s)$ is convex in $s$, it is differentiable
at all but at most countably many points. If $F^*(s)$ is differentiable at $s = H(Y)$, then
$\frac{dF^*(s)}{ds}\big|_{s = H(Y)} \le 1$ because the line $s + H(Z) - H(Y)$ with slope 1 supports the curve $F^*(s)$ at
its end point $(H(Y), F^*(H(Y)))$. For any $H(Y|X) \le s < H(Y)$ where $F^*(s)$ is differentiable,
since $F^*(s)$ is convex, the slope of the supporting line at the point $(s, F^*(s))$ is less than or
equal to the slope of the supporting line $s + H(Z) - H(Y)$ at the point $(H(Y), F^*(H(Y)))$.
Thus, for any $H(Y|X) \le s \le H(Y)$ where $F^*(s)$ is differentiable,

$$\frac{dF^*(s)}{ds} \le 1. \tag{24}$$

Moreover, $\frac{dF^*(s)}{ds} \ge 0$ because $F^*(s)$ is monotonically nondecreasing. The illustrations of the function
$F^*(s) = F^*_{T_{YX},T_{ZX}}(q,s)$ and $C^*_q$ are shown in Fig. 1. Q.E.D.

For $X \sim q$, where $q$ is a fixed vector, by Theorem 2, finding the maximum of $R_2 + \lambda R_1$
is equivalent to finding the minimum of $F^*(q,s) - \lambda s$. Theorem 4 indicates that for every
$\lambda > 1$, the minimum of $F^*(q,s) - \lambda s$ is attained when $s = H(Y)$ and $F^*(s) = H(Z)$, i.e., $U$
is a constant. Thus, the non-trivial range of $\lambda$ is $0 \le \lambda \le 1$.

The following theorem is the key to the applications in Section V and is an extension and
generalization of Theorem 2.4 in [10]. Let $\mathbf{X} = (X_1, \dots, X_N)$ be a sequence of channel inputs
to the degraded broadcast channel $X \to Y \to Z$. The corresponding channel outputs are
$\mathbf{Y} = (Y_1, \dots, Y_N)$ and $\mathbf{Z} = (Z_1, \dots, Z_N)$. Thus, the channel outputs $(Y_i, Z_i)$,
$i = 1, \dots, N$, are conditionally independent of each other given the channel inputs $\mathbf{X}$. Note
that the channel outputs $(Y_i, Z_i)$ do not have to be identically or independently distributed
since $X_1, \dots, X_N$ could be correlated and have different distributions. Denote by $q_i$ the
distribution of $X_i$ for $i = 1, \dots, N$. Thus, $q = \sum_{i=1}^{N} q_i / N$ is the average of the distributions of
the channel inputs. For any $q \in \Delta_k$, define $F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns)$ to be the infimum of
$H(\mathbf{Z}|U)$ with respect to all random variables $U$ and all possible channel inputs $\mathbf{X}$ such that
$H(\mathbf{Y}|U) = Ns$, the average of the distributions of the channel inputs is $q$, and
$U \to \mathbf{X} \to \mathbf{Y} \to \mathbf{Z}$ is a Markov chain.

Theorem 5: For all $N = 1, 2, \dots$, and all $T_{YX}$, $T_{ZX}$, $q$, and $H(Y|X) \le s \le H(Y)$, one has

$$F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns) = N F^*_{T_{YX},T_{ZX}}(q, s). \tag{25}$$

Proof: We first prove that $F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns) \ge N F^*_{T_{YX},T_{ZX}}(q,s)$. Since

$$\begin{aligned}
Ns = H(\mathbf{Y}|U) &= \sum_{i=1}^{N} H(Y_i \mid Y_1, \dots, Y_{i-1}, U) && (26) \\
&= \sum_{i=1}^{N} s_i, && (27)
\end{aligned}$$

where $s_i = H(Y_i \mid Y_1, \dots, Y_{i-1}, U)$ and (26) follows from the chain rule of entropy, we have

$$\begin{aligned}
H(\mathbf{Z}|U) &= \sum_{i=1}^{N} H(Z_i \mid Z_1, \dots, Z_{i-1}, U) && (28) \\
&\ge \sum_{i=1}^{N} H(Z_i \mid Z_1, \dots, Z_{i-1}, Y_1, \dots, Y_{i-1}, U) && (29) \\
&= \sum_{i=1}^{N} H(Z_i \mid Y_1, \dots, Y_{i-1}, U) && (30) \\
&\ge \sum_{i=1}^{N} F^*_{T_{YX},T_{ZX}}(q_i, s_i) && (31) \\
&\ge N F^*_{T_{YX},T_{ZX}}\Big( \sum_{i=1}^{N} q_i / N,\ \sum_{i=1}^{N} s_i / N \Big) && (32) \\
&= N F^*_{T_{YX},T_{ZX}}(q, s). && (33)
\end{aligned}$$

Some of these steps are justified as follows:

• (28) follows from the chain rule of entropy;

• (29) holds because conditional entropy decreases when the conditioning increases;

• (30) follows from the fact that $Z_i$ and $Z_1, \dots, Z_{i-1}$ are conditionally independent given
$Y_1, \dots, Y_{i-1}$ and $U$;

• (31) follows from the definition of $F^*$ if considering the Markov chain
$(U, Y_1, \dots, Y_{i-1}) \to X_i \to Y_i \to Z_i$;

• (32) results from applying Jensen's inequality to the convex function $F^*$.

By the definition of $F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns)$, Equation (33) implies that

$$F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns) \ge N F^*_{T_{YX},T_{ZX}}(q, s). \tag{34}$$

On the other hand, in the case that $U$ is composed of $N$ independently identically distributed
(i.i.d.) random variables $(U_1, \dots, U_N)$, and each $U_i \to X_i$ achieves $p_{X_i} = q$, $H(Y_i|U_i) = s$
and $H(Z_i|U_i) = F^*_{T_{YX},T_{ZX}}(q,s)$, one has $H(\mathbf{Y}|U) = Ns$ and $H(\mathbf{Z}|U) = N F^*_{T_{YX},T_{ZX}}(q,s)$.
Since $F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}$ is defined by taking the minimum,

$$F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns) \le N F^*_{T_{YX},T_{ZX}}(q, s). \tag{35}$$

Combining (34) and (35), one has $F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns) = N F^*_{T_{YX},T_{ZX}}(q,s)$. Q.E.D.

Theorem 5 indicates that if the degraded broadcast channel $X \to Y \to Z$ is used $N$
times with a fixed $q$ as the average of the distributions of the channel inputs, the
conditional entropy bound $F^*_{T^{(N)}_{YX}, T^{(N)}_{ZX}}(q, Ns)$ is achieved when the channel is used independently
and identically $N$ times, and each single use of the channel achieves the
conditional entropy bound $F^*_{T_{YX},T_{ZX}}(q, s)$.

IV. Evaluation of $F^*(\cdot)$

In this section, we evaluate $F^*(s) = F^*_{T_{YX},T_{ZX}}(q,s)$ via a duality technique, which is also
used for evaluating $F(\cdot)$ in [10]. This duality technique also provides the optimal transmission
strategy for the DBC $X \to Y \to Z$ to achieve the maximum of $R_2 + \lambda R_1$ for any $\lambda \ge 0$.

Theorem 3 shows that

$$F^*_{T_{YX},T_{ZX}}(q,s) = \min \{ \eta \mid (s,\eta) \in C^*_q \} = \min \{ \eta \mid (q,s,\eta) \in C \}. \tag{36}$$

Thus, the function $F^*_{T_{YX},T_{ZX}}(q,s)$ is determined by the lower boundary of $C^*_q$. Since $C^*_q$ is
convex, its lower boundary can be described by the lines supporting its graph from below.
The line with slope $\lambda$ in the $(\xi, \eta)$-plane supporting $C^*_q$, as shown in Fig. 1, has the equation

$$\eta = \lambda \xi + \psi(q, \lambda), \tag{37}$$

where $\psi(q, \lambda)$ is the $\eta$-intercept of the tangent line with slope $\lambda$ for the function $F^*_{T_{YX},T_{ZX}}(q,s)$.
Thus,

$$\begin{aligned}
\psi(q, \lambda) &= \min \{ F^*(q, \xi) - \lambda \xi \mid H(Y|X) \le \xi \le H(Y) \} && (38) \\
&= \min \{ \eta - \lambda \xi \mid (\xi, \eta) \in C^*_q \} && (39) \\
&= \min \{ \eta - \lambda \xi \mid (q, \xi, \eta) \in C \}. && (40)
\end{aligned}$$

For $H(Y|X) \le s \le H(Y)$, the function $F^*(s) = F^*_{T_{YX},T_{ZX}}(q,s)$ can be represented as

$$F^*(s) = \max \{ \psi(q, \lambda) + \lambda s \mid -\infty < \lambda < \infty \}. \tag{41}$$

Theorem 1 shows that the graph of $F^*(s)$ is supported at $s = H(Y|X)$ by a line of slope 0,
and Theorem 4 shows that the graph of $F^*(s)$ is supported at $s = H(Y)$ by a line of slope
1. Thus, for $H(Y|X) \le s \le H(Y)$,

$$F^*(s) = \max \{ \psi(q, \lambda) + \lambda s \mid 0 \le \lambda \le 1 \}. \tag{42}$$

Let $L_\lambda$ be the linear transformation $(q, \xi, \eta) \mapsto (q, \eta - \lambda \xi)$. It maps $C$ and $S$ onto the sets

$$C_\lambda = \{ (q, \eta - \lambda \xi) \mid (q, \xi, \eta) \in C \}, \tag{43}$$

and

$$S_\lambda = \{ (q, h_m(T_{ZX} q) - \lambda h_n(T_{YX} q)) \mid q \in \Delta_k \}. \tag{44}$$

The lower boundaries of $C_\lambda$ and $S_\lambda$ are the graphs of $\psi(q, \lambda)$ and
$\phi(q, \lambda) = h_m(T_{ZX} q) - \lambda h_n(T_{YX} q)$ respectively. Since $C$ is the convex hull of $S$, and thus $C_\lambda$ is
the convex hull of $S_\lambda$, $\psi(q, \lambda)$ is the lower convex envelope of $\phi(q, \lambda)$ on $\Delta_k$.

In conclusion, $\psi(\cdot, \lambda)$ can be obtained by forming the lower convex envelope of $\phi(\cdot, \lambda)$ for
each $\lambda$, and $F^*(q,s)$ can be reconstructed from $\psi(q, \lambda)$ by (42). This is the dual approach to
the evaluation of $F^*$.
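The dual approach can be sketched numerically for a binary input alphabet: sample $\phi(\cdot, \lambda)$ on a grid, take its lower convex envelope to approximate $\psi(\cdot, \lambda)$, and maximize $\psi(q, \lambda) + \lambda s$ over a grid of $\lambda \in [0, 1]$ as in (42). The sketch below is only an illustration under stated assumptions: the component channels are taken to be binary symmetric with hypothetical crossover probabilities 0.1 and 0.2, the grids are arbitrary, and all function names are invented here.

```python
import math

def h(x):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def phi(p, lam, a1, a2):
    """phi(p, lambda) = h_m(T_ZX q) - lambda * h_n(T_YX q) for q = (p, 1 - p)^T."""
    return h(a2 + (1.0 - 2.0 * a2) * p) - lam * h(a1 + (1.0 - 2.0 * a1) * p)

def lower_convex_envelope(xs, ys):
    """Lower convex envelope of the points (xs[i], ys[i]), xs strictly increasing,
    evaluated at every xs[i] (monotone-chain lower hull, then linear interpolation)."""
    hull = []
    for x3, y3 in zip(xs, ys):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Pop the middle point when it lies on or above the chord.
            if (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1) <= 0.0:
                hull.pop()
            else:
                break
        hull.append((x3, y3))
    env, j = [], 0
    for x in xs:
        while j + 2 < len(hull) and hull[j + 1][0] < x:
            j += 1
        (x1, y1), (x2, y2) = hull[j], hull[j + 1]
        env.append(y1 + (y2 - y1) * (x - x1) / (x2 - x1))
    return env

def f_star_dual(q_index, s, ps, lams, a1, a2):
    """Approximate F*(q, s) via (42): the max over lambda of psi(q, lambda) + lambda * s."""
    best = -math.inf
    for lam in lams:
        psi = lower_convex_envelope(ps, [phi(p, lam, a1, a2) for p in ps])
        best = max(best, psi[q_index] + lam * s)
    return best

# Hypothetical parameters: crossover probabilities 0.1 and 0.2, uniform input q = 1/2.
A1, A2 = 0.1, 0.2
PS = [i / 400 for i in range(401)]
LAMS = [i / 200 for i in range(201)]
S = h(A1 + (1.0 - 2.0 * A1) * 0.2)   # a value of s inside [H(Y|X), H(Y)]
F_NUM = f_star_dual(200, S, PS, LAMS, A1, A2)
```

The grid resolution controls the accuracy of both the envelope and the maximization over $\lambda$; a finer grid trades running time for precision.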

Theorem 2 represents the capacity region for a DBC by the function $F^*(q,s)$. Since
$\psi(q, \lambda)$ and $F^*(q,s)$ can be constructed from each other by (38) and (42), for any $\lambda \ge 0$,
the associated point on the boundary of the capacity region may be found (from its unique
value of $R_2 + \lambda R_1$) as follows:

$$\begin{aligned}
&\max_{q \in \Delta_k} \max \{ R_2 + \lambda R_1 \mid p_X = q \} \\
&= \max_{q \in \Delta_k} \max_s \{ H(Z) - F^*(q,s) + \lambda s - \lambda H(Y|X) \} \\
&= \max_{q \in \Delta_k} \big( H(Z) - \lambda H(Y|X) - \min_s \{ F^*(q,s) - \lambda s \} \big) \\
&= \max_{q \in \Delta_k} \big( H(Z) - \lambda H(Y|X) - \psi(q, \lambda) \big).
\end{aligned} \tag{45}$$

We have shown the relationship among $F^*$, $\psi$ and the capacity region for the DBC. Now
we state a theorem which provides the relationship among $F^*(q,s)$, $\psi(q, \lambda)$, $\phi(q, \lambda)$, and the
optimal transmission strategies for the DBC.

Theorem 6: i) For any $0 \le \lambda \le 1$, if a point of the graph of $\psi(\cdot, \lambda)$ is the convex
combination of $l$ points of the graph of $\phi(\cdot, \lambda)$ with arguments $p_j$ and weights $w_j$, $j = 1, \dots, l$,
then

$$F^*_{T_{YX},T_{ZX}}\Big( \sum_j w_j p_j,\ \sum_j w_j h_n(T_{YX} p_j) \Big) = \sum_j w_j h_m(T_{ZX} p_j). \tag{46}$$

Furthermore, for a fixed channel input distribution $q = \sum_j w_j p_j$, the optimal transmission
strategy to achieve the maximum of $R_2 + \lambda R_1$ is determined by $l$, $w_j$ and $p_j$. In particular, an
optimal transmission strategy has $|\mathcal{U}| = l$, $\Pr(U = j) = w_j$ and $p_{X|U=j} = p_j$, where $p_{X|U=j}$
denotes the conditional distribution of $X$ given $U = j$.

ii) For a predetermined channel input distribution $q$, if the transmission strategy $|\mathcal{U}| = l$,
$\Pr(U = j) = w_j$ and $p_{X|U=j} = p_j$ achieves $\max \{ R_2 + \lambda R_1 \mid \sum_j w_j p_j = q \}$, then the point
$(q, \psi(q, \lambda))$ is the convex combination of $l$ points of the graph of $\phi(\cdot, \lambda)$ with arguments $p_j$
and weights $w_j$, $j = 1, \dots, l$.

The proof is given in Appendix III.

Note that if for some pair $(q, \lambda)$, $\psi(q, \lambda) = \phi(q, \lambda)$, then the corresponding optimal
transmission strategy has $l = 1$, which means $U$ is a constant. Thus, the line $\eta = \lambda \xi + \psi(q, \lambda)$
supports the graph of $F^*(q, \cdot)$ at its endpoint $(H(Y), H(Z)) = (h_n(T_{YX} q), h_m(T_{ZX} q))$.

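A. Example: broadcast binary-symmetric channel

For the broadcast binary-symmetric channel $X \to Y \to Z$ with

$$T_{YX} = \begin{bmatrix} 1 - \alpha_1 & \alpha_1 \\ \alpha_1 & 1 - \alpha_1 \end{bmatrix}, \qquad
T_{ZX} = \begin{bmatrix} 1 - \alpha_2 & \alpha_2 \\ \alpha_2 & 1 - \alpha_2 \end{bmatrix}, \tag{47}$$

where $0 < \alpha_1 < \alpha_2 < 1/2$, one has

$$\begin{aligned}
\phi(p, \lambda) &= \phi((p, 1-p)^T, \lambda) \\
&= h_m(T_{ZX} q) - \lambda h_n(T_{YX} q) \\
&= h((1 - \alpha_2) p + \alpha_2 (1 - p)) - \lambda h((1 - \alpha_1) p + \alpha_1 (1 - p)),
\end{aligned} \tag{48}$$

where $h(x) = -x \ln x - (1 - x) \ln(1 - x)$ is the binary entropy function. Taking the second
derivative of $\phi(p, \lambda)$ with respect to $p$, we have

$$\phi''(p, \lambda) = \frac{-(1 - 2\alpha_2)^2}{(\alpha_2 p + (1 - \alpha_2)(1 - p))((1 - \alpha_2) p + \alpha_2 (1 - p))}
+ \frac{\lambda (1 - 2\alpha_1)^2}{(\alpha_1 p + (1 - \alpha_1)(1 - p))((1 - \alpha_1) p + \alpha_1 (1 - p))}, \tag{49}$$

which has the sign of

$$\rho(p, \lambda) = -\Big( \frac{1 - \alpha_1}{1 - 2\alpha_1} - p \Big)\Big( \frac{\alpha_1}{1 - 2\alpha_1} + p \Big)
+ \lambda \Big( \frac{1 - \alpha_2}{1 - 2\alpha_2} - p \Big)\Big( \frac{\alpha_2}{1 - 2\alpha_2} + p \Big). \tag{50}$$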
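For any $0 \le \lambda \le 1$,

$$\min_p \rho(p, \lambda) = \frac{\lambda}{4(1 - 2\alpha_2)^2} - \frac{1}{4(1 - 2\alpha_1)^2}. \tag{51}$$

Thus, for $\lambda \ge (1 - 2\alpha_2)^2 / (1 - 2\alpha_1)^2$, $\phi''(p, \lambda) \ge 0$ for all $0 \le p \le 1$, and so $\psi(p, \lambda) = \phi(p, \lambda)$.
In this case, the optimal transmission strategy achieving the maximum of $R_1$ also achieves
the maximum of $R_2 + \lambda R_1$, and thus the optimal transmission strategy has $l = 1$, which
means $U$ is a constant.

Note that $\phi(1/2 + p, \lambda) = \phi(1/2 - p, \lambda)$. For $\lambda < (1 - 2\alpha_2)^2 / (1 - 2\alpha_1)^2$, $\phi(p, \lambda)$ has negative
second derivative on an interval symmetric about $p = 1/2$. Let $p_\lambda = \arg\min_p \phi(p, \lambda)$ with
$p_\lambda \le 1/2$. Thus $p_\lambda$ satisfies $\phi'_p(p_\lambda, \lambda) = 0$.

By symmetry, the envelope $\psi(\cdot, \lambda)$ is obtained by replacing $\phi(p, \lambda)$ on the interval
$(p_\lambda, 1 - p_\lambda)$ by its minimum over $p$, which is shown in Fig. 2. Therefore, the lower envelope of $\phi(p, \lambda)$
is

$$\psi(p, \lambda) = \begin{cases} \phi(p_\lambda, \lambda), & p_\lambda \le p \le 1 - p_\lambda, \\ \phi(p, \lambda), & \text{otherwise.} \end{cases} \tag{52}$$

For the predetermined distribution of $X$, $p_X = q = (q, 1 - q)^T$ with $p_\lambda < q < 1 - p_\lambda$,
$(q, \psi(q, \lambda))$ is the convex combination of the points $(p_\lambda, \psi(p_\lambda, \lambda))$ and $(1 - p_\lambda, \psi(1 - p_\lambda, \lambda))$.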
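Therefore, by Theorem 6, $F^*(q,s) = h_2(T_{ZX} \cdot (p_\lambda, 1 - p_\lambda)^T) = h(\alpha_2 + (1 - 2\alpha_2) p_\lambda)$ for
$s = h_2(T_{YX} \cdot (p_\lambda, 1 - p_\lambda)^T) = h(\alpha_1 + (1 - 2\alpha_1) p_\lambda)$, and $0 \le p_\lambda \le q$ or $1 - q \le p_\lambda \le 1$.
This defines $F^*(q, \cdot)$ on its entire domain $[h(\alpha_1), h(\alpha_1 + (1 - 2\alpha_1) q)]$, i.e.,

$$F^*(q, h(\alpha_1 + (1 - 2\alpha_1) p)) = h(\alpha_2 + (1 - 2\alpha_2) p), \qquad 0 \le p \le q \ \text{ or } \ 1 - q \le p \le 1.$$

For the predetermined distribution of $X$, $q = (q, 1 - q)^T$ with $q < p_\lambda$ or $q > 1 - p_\lambda$,
one has $\phi(q, \lambda) = \psi(q, \lambda)$, which means that a line with slope $\lambda$ supports $F^*(q, \cdot)$ at the point
$s = H(Y) = h(\alpha_1 + (1 - 2\alpha_1) q)$, and thus the optimal transmission strategy has $l = 1$, which
means $U$ is a constant.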
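The parametric description of $F^*(q, \cdot)$ for the broadcast binary-symmetric channel can be evaluated directly: solve $s = h(\alpha_1 + (1 - 2\alpha_1) p)$ for $p$, then return $h(\alpha_2 + (1 - 2\alpha_2) p)$. The following minimal Python sketch assumes $q \le 1/2$ and inverts $h$ by bisection; the function names and tolerances are illustrative assumptions rather than anything specified in the paper.

```python
import math

def h(x):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def h_inv(s, tol=1e-13):
    """Inverse of h on [0, 1/2], found by bisection (h is increasing there)."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f_star_bsc(q, s, a1, a2):
    """F*(q, s) for the broadcast binary-symmetric channel, assuming q <= 1/2.

    Solves s = h(a1 + (1 - 2 a1) p) for p in [0, q] and returns h(a2 + (1 - 2 a2) p).
    """
    s_min, s_max = h(a1), h(a1 + (1.0 - 2.0 * a1) * q)
    if not (s_min - 1e-12 <= s <= s_max + 1e-12):
        raise ValueError("s must lie in [H(Y|X), H(Y)]")
    p = (h_inv(s) - a1) / (1.0 - 2.0 * a1)
    return h(a2 + (1.0 - 2.0 * a2) * p)
```

At the two ends of the domain the sketch reproduces the endpoints (18) and (19): $s = H(Y|X)$ gives $H(Z|X) = h(\alpha_2)$, and $s = H(Y)$ gives $H(Z)$.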



V. Broadcast Z Channels 

The Z channel, shown in Fig. 3(a), is a binary asymmetric channel which is noiseless when
symbol 1 is transmitted but noisy when symbol 0 is transmitted. The channel output $Y$
is the binary OR of the channel input $X$ and Bernoulli distributed noise with parameter
$\alpha$. The capacity of the Z channel was studied in [20]. The broadcast Z channel is a class
of discrete memoryless broadcast channels whose component channels are Z channels. A
two-user broadcast Z channel with marginal transition probability matrices

$$T_{YX} = \begin{bmatrix} 1 & \alpha_1 \\ 0 & 1 - \alpha_1 \end{bmatrix}, \qquad
T_{ZX} = \begin{bmatrix} 1 & \alpha_2 \\ 0 & 1 - \alpha_2 \end{bmatrix},$$

where $0 < \alpha_1 < \alpha_2 < 1$, is shown in Fig. 3(b). The two-user broadcast Z channel is
stochastically degraded and can be modeled as a physically degraded broadcast channel as
shown in Fig. 4, where $\alpha_\Delta = (\alpha_2 - \alpha_1)/(1 - \alpha_1)$ [12]. In the NE scheme for broadcast Z
channels, the transmitter first independently encodes the users' information messages into binary
codewords and then broadcasts the binary OR of these encoded codewords. The NE scheme
achieves the whole boundary of the capacity region for the two-user broadcast Z channel [12]
[13]. In this section, we will show that the NE scheme also achieves the boundary of the
capacity region for multi-user broadcast Z channels.
Fig. 3

The broadcast Z channel

Fig. 4

The degraded version of the broadcast Z channel 



A. F* for the broadcast Z channel 

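For the broadcast Z channel $X \to Y \to Z$ shown in Fig. 3(b) and Fig. 4 with

$$T_{YX} = \begin{bmatrix} 1 & \alpha_1 \\ 0 & \beta_1 \end{bmatrix}, \qquad T_{ZX} = \begin{bmatrix} 1 & \alpha_2 \\ 0 & \beta_2 \end{bmatrix}, \tag{54}$$

where $0 < \alpha_1 < \alpha_2 < 1$, $\beta_1 = 1 - \alpha_1$, and $\beta_2 = 1 - \alpha_2$, one has

$$\phi(p, \lambda) = h_2(T_{ZX} (1 - p, p)^T) - \lambda h_2(T_{YX} (1 - p, p)^T) = h(p\beta_2) - \lambda h(p\beta_1). \tag{55}$$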

Taking the second derivative of $\phi(p, \lambda)$ with respect to $p$, we have

$$\phi''(p, \lambda) = \frac{-\beta_2^2}{(1 - p\beta_2)\, p\beta_2} - \lambda\, \frac{-\beta_1^2}{(1 - p\beta_1)\, p\beta_1}, \tag{56}$$

which has the sign of

$$\rho(p, \lambda) = p\beta_1\beta_2(1 - \lambda) + \lambda\beta_1 - \beta_2. \tag{57}$$

Fig. 5

Illustrations of $\phi(\cdot, \lambda)$ and $\psi(\cdot, \lambda)$ for the broadcast Z channel



Let $\beta_\Delta = \beta_2/\beta_1$. For the case of $\beta_\Delta \le \lambda \le 1$, $\phi''(p, \lambda) \ge 0$ for all $0 \le p \le 1$. Hence, $\phi(p, \lambda)$ is convex in $p$ and thus $\phi(p, \lambda) = \psi(p, \lambda)$ for all $0 \le p \le 1$. In this case, the optimal transmission strategy achieving the maximum of $R_1$ also achieves the maximum of $R_2 + \lambda R_1$, and the optimal transmission strategy has $l = 1$, i.e., $U$ is a constant. Note that the transmission strategy with $l = 1$ is a special case of the NE scheme in which the only codeword for the second user is an all-ones codeword.

For the case of $0 \le \lambda < \beta_\Delta$, $\phi(p, \lambda)$ is concave in $p$ on $\left[0, \frac{\beta_2 - \lambda\beta_1}{\beta_1\beta_2(1 - \lambda)}\right]$ and convex on $\left[\frac{\beta_2 - \lambda\beta_1}{\beta_1\beta_2(1 - \lambda)}, 1\right]$. The graph of $\phi(\cdot, \lambda)$ in this case is shown in Fig. 5. Since $\phi(0, \lambda) = 0$, $\psi(\cdot, \lambda)$, the lower convex envelope of $\phi(\cdot, \lambda)$, is constructed by drawing the tangent through the origin. Let $(p_\lambda, \phi(p_\lambda, \lambda))$ be the point of contact. The value of $p_\lambda$ is determined by $\phi_p(p_\lambda, \lambda) = \phi(p_\lambda, \lambda)/p_\lambda$, i.e.,

$$\ln(1 - \beta_2 p_\lambda) = \lambda \ln(1 - \beta_1 p_\lambda). \tag{58}$$

Let $\mathbf{q} = (1 - q, q)^T$ be the distribution of the channel input $X$. For $q \le p_\lambda$, $\psi(q, \lambda)$ is obtained as a convex combination of the points $(0, 0)$ and $(p_\lambda, \phi(p_\lambda, \lambda))$ with weights $(p_\lambda - q)/p_\lambda$ and $q/p_\lambda$. By Theorem 6, it corresponds to $s = [(p_\lambda - q)/p_\lambda] \cdot 0 + [q/p_\lambda]\, h(\beta_1 p_\lambda) = q\, h(\beta_1 p_\lambda)/p_\lambda$ and $F^*(\mathbf{q}, s) = q/p_\lambda \cdot h(\beta_2 p_\lambda)$. Hence, for the broadcast Z channel,

$$F^*_{T_{YX},T_{ZX}}\big(\mathbf{q},\, q\, h(\beta_1 p)/p\big) = q\, h(\beta_2 p)/p \tag{59}$$

for $p \in [q, 1]$, which defines $F^*_{T_{YX},T_{ZX}}(\mathbf{q}, \cdot)$ on its entire domain $[q\, h(\beta_1), h(q\beta_1)]$. Also by Theorem 6, the optimal transmission strategy $U \to X$ achieving $\max\{R_2 + \lambda R_1 \mid \sum_j w_j \mathbf{p}_j = \mathbf{q}\}$ is determined by $l = 2$, $w_1 = (p_\lambda - q)/p_\lambda$, $w_2 = q/p_\lambda$, $\mathbf{p}_1 = (1, 0)^T$ and $\mathbf{p}_2 = (1 - p_\lambda, p_\lambda)^T$. Since the optimal transmission strategy $U \to X$ is a Z channel as shown in Fig. 6, the random variable $X$ could also be constructed as the OR operation of two Bernoulli random variables with parameters $(p_\lambda - q)/p_\lambda$ and $1 - p_\lambda$ respectively. Hence, the optimal transmission strategy for the broadcast Z channel is still the NE scheme in this case. For $q > p_\lambda$, $\psi(q, \lambda) = \phi(q, \lambda)$ and so the optimal transmission strategy has $l = 1$, i.e., $U$ is a constant. Therefore, we provide an alternative proof to show that the NE scheme achieves the whole boundary of the capacity region for the two-user broadcast Z channel.

Fig. 6

The optimal transmission strategy for the two-user broadcast Z channel
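The tangent condition (58) and the parametrization (59) can be explored numerically. The following is a minimal sketch, not the authors' procedure: the function names `tangent_point` and `ne_rate_pair` are hypothetical, entropies are in nats, and the rate formulas assume the region $R_1 \le s - H(Y|X)$, $R_2 \le H(Z) - F^*(\mathbf{q}, s)$ with $H(Y|X) = q\,h(\beta_1)$ and $H(Z) = h(q\beta_2)$. When (58) has no solution in $(0, 1)$, the sketch assumes the contact point is taken at the endpoint $p = 1$.

```python
import math

def h(x):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def tangent_point(beta1, beta2, lam, iters=200):
    """Solve ln(1 - beta2*p) = lam * ln(1 - beta1*p) for p in (0, 1].

    Assumes 0 < beta2 < beta1 < 1 and 0 <= lam < beta2/beta1.
    Returns 1.0 if there is no interior solution (assumed endpoint case).
    """
    def g(p):
        return math.log(1.0 - beta2 * p) - lam * math.log(1.0 - beta1 * p)

    if g(1.0) <= 0.0:
        return 1.0
    # g is negative at the inflection point of phi and positive at p = 1,
    # so bisection between them brackets the unique non-zero root.
    lo = (beta2 - lam * beta1) / (beta1 * beta2 * (1.0 - lam))
    hi = 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ne_rate_pair(alpha1, alpha2, q, p):
    """Rate pair (R1, R2) in nats for Pr(X = noisy symbol) = q and p in [q, 1]."""
    b1, b2 = 1.0 - alpha1, 1.0 - alpha2
    r1 = q / p * h(b1 * p) - q * h(b1)     # s - H(Y|X)
    r2 = h(b2 * q) - q / p * h(b2 * p)     # H(Z) - F*(q, s)
    return r1, r2
```

Sweeping `p` from `q` to 1 in `ne_rate_pair` traces the trade-off between the two users: `p = q` gives all rate to User 1 and `p = 1` gives all rate to User 2.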

B. Multi-user broadcast Z channel 

Let $\mathbf{X} = (X_1, \dots, X_N)$ be a sequence of channel inputs to the broadcast Z channel $X \to Y \to Z$ satisfying (54). The corresponding channel outputs are $\mathbf{Y} = (Y_1, \dots, Y_N)$ and $\mathbf{Z} = (Z_1, \dots, Z_N)$. Thus, the channel outputs $(Y_i, Z_i)$, $i = 1, \dots, N$, are conditionally independent of each other given the channel inputs $\mathbf{X}$. Note that the channel outputs $(Y_i, Z_i)$ do not have to be identically or independently distributed since $X_1, \dots, X_N$ could be correlated and have different distributions.

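Lemma 5: Consider the Markov chain $U \to \mathbf{X} \to \mathbf{Y} \to \mathbf{Z}$ with $\sum_{i=1}^{N} \Pr(X_i = 0)/N = q$. If

$$H(\mathbf{Y}|U) \ge N \cdot \frac{q}{p} \cdot h(\beta_1 p) \tag{60}$$

for some $p \in [q, 1]$, then

$$H(\mathbf{Z}|U) \ge N \cdot \frac{q}{p} \cdot h(\beta_2 p) \tag{61}$$

$$= N \cdot \frac{q}{p} \cdot h(\beta_1 p \beta_\Delta). \tag{62}$$

The proof is given in Appendix IV.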

Consider a $K$-user broadcast Z channel with marginal transition probability matrices

$$T_{Y_j X} = \begin{bmatrix} 1 & \alpha_j \\ 0 & \beta_j \end{bmatrix}, \tag{63}$$

where $0 < \alpha_1 < \cdots < \alpha_K < 1$, and $\beta_j = 1 - \alpha_j$ for $j = 1, \dots, K$. The $K$-user broadcast Z channel is stochastically degraded and can be modeled as a physically degraded broadcast channel as shown in Fig. 7. The NE scheme for the $K$-user broadcast Z channel is to independently encode the $K$ users' information messages into $K$ binary codewords and broadcast the binary OR of these $K$ encoded codewords. The $j$-th user then successively decodes the messages for User $K$, User $K - 1$, $\dots$, and finally for User $j$. The codebook for the $j$-th user is designed by a random coding technique according to the binary random variable $X^{(j)}$ with $\Pr\{X^{(j)} = 0\} = q^{(j)}$. Denote $X^{(i)} \circ X^{(j)}$ as the OR of $X^{(i)}$ and $X^{(j)}$. Hence, the channel input $X$ is the OR of $X^{(j)}$ for all $1 \le j \le K$, i.e., $X = X^{(1)} \circ \cdots \circ X^{(K)}$. From the coding theorem for DBCs [2] [3], the achievable region of the NE scheme for the $K$-user broadcast Z channel is determined by

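$$R_j \le I(X^{(j)}; Y_j \mid X^{(j+1)}, \dots, X^{(K)}) \tag{64}$$

$$= H(Y_j \mid X^{(j+1)}, \dots, X^{(K)}) - H(Y_j \mid X^{(j)}, X^{(j+1)}, \dots, X^{(K)}) \tag{65}$$

$$= \Big(\prod_{i=j+1}^{K} q^{(i)}\Big) \cdot h\Big(\beta_j \prod_{i=1}^{j} q^{(i)}\Big) - \Big(\prod_{i=j}^{K} q^{(i)}\Big) \cdot h\Big(\beta_j \prod_{i=1}^{j-1} q^{(i)}\Big) \tag{66}$$

$$= \frac{q}{t_j}\, h(\beta_j t_j) - \frac{q}{t_{j-1}}\, h(\beta_j t_{j-1}), \tag{67}$$

where $t_j = \prod_{i=1}^{j} q^{(i)}$ for $j = 1, \dots, K$, and $q = \Pr(X = 0) = \prod_{i=1}^{K} q^{(i)}$. Denote $t_0 = 1$. Since $0 \le q^{(1)}, \dots, q^{(K)} \le 1$, one has

$$1 = t_0 \ge t_1 \ge \cdots \ge t_K = q. \tag{68}$$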
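To make (67) and (68) concrete, the sketch below evaluates the NE rate bounds for given codebook parameters. It is an illustration only: the helper name `ne_rates` is hypothetical, entropies are in nats, and every $q^{(j)}$ is assumed to be strictly positive so that the ratios $q/t_j$ are well defined.

```python
import math

def h(x):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def ne_rates(alphas, qs):
    """Right-hand sides of (67) for j = 1..K.

    alphas[j-1] is alpha_j and qs[j-1] is q^(j) = Pr{X^(j) = 0} (assumed > 0).
    """
    t = [1.0]                      # t_0 = 1
    for qj in qs:
        t.append(t[-1] * qj)       # t_j = q^(1) * ... * q^(j)
    q = t[-1]                      # q = t_K = Pr(X = 0)
    rates = []
    for j, alpha in enumerate(alphas, start=1):
        beta = 1.0 - alpha
        rates.append(q / t[j] * h(beta * t[j]) - q / t[j - 1] * h(beta * t[j - 1]))
    return rates
```

Because $h(\beta t)/t$ is non-increasing in $t$ and $t_j \le t_{j-1}$, each returned rate is non-negative, which is a quick sanity check on any parameter choice.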



We now state and prove that the achievable region of the NE scheme is the capacity region for the multi-user broadcast Z channel. Fig. 8 shows the communication system for the $K$-user broadcast Z channel. $\mathbf{X} = (X_1, \dots, X_N)$ is a length-$N$ codeword determined by the messages $W_1, \dots, W_K$. $\mathbf{Y}_1, \dots, \mathbf{Y}_K$ are the channel outputs corresponding to the channel input $\mathbf{X}$.

Theorem 7: If $\sum_{i=1}^{N} \Pr\{X_i = 0\}/N = q$, then no point $(R_1, \dots, R_K)$ such that

$$R_j \ge \frac{q}{t_j}\, h(\beta_j t_j) - \frac{q}{t_{j-1}}\, h(\beta_j t_{j-1}), \quad j = 1, \dots, K,$$

$$R_d = \frac{q}{t_d}\, h(\beta_d t_d) - \frac{q}{t_{d-1}}\, h(\beta_d t_{d-1}) + \delta, \quad \text{for some } d \in \{1, \dots, K\},\ \delta > 0, \tag{69}$$

is achievable, where the $t_j$ are as in (67) and (68).

Fig. 8

The communication system for the multi-user broadcast Z channel



Proof (by contradiction): This proof borrows the idea of proving the converse of the coding theorem for broadcast Gaussian channels [2]. Lemma 5 plays the same role in this proof as the entropy power inequality does in the proof for broadcast Gaussian channels.

We suppose that the rates of (69) are achievable, which means that the probability of decoding error for each receiver can be upper bounded by an arbitrarily small $\epsilon$ for sufficiently large $N$:

$$\Pr\{\hat{W}_j \ne W_j \mid \mathbf{Y}_j\} < \epsilon, \quad j = 1, \dots, K. \tag{70}$$

By Fano's inequality, this implies that

$$H(W_j \mid \mathbf{Y}_j) \le h(\epsilon) + \epsilon \ln(M_j - 1), \quad j = 1, \dots, K. \tag{71}$$
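Let $o(\epsilon)$ represent any function of $\epsilon$ such that $o(\epsilon) \ge 0$ and $o(\epsilon) \to 0$ as $\epsilon \to 0$. Equation (71) implies that $H(W_j \mid \mathbf{Y}_j)$, $j = 1, \dots, K$, are all $o(\epsilon)$. Therefore,

$$H(W_j) = H(W_j \mid W_{j+1}, \dots, W_K) \tag{72}$$

$$= I(W_j; \mathbf{Y}_j \mid W_{j+1}, \dots, W_K) + H(W_j \mid \mathbf{Y}_j, W_{j+1}, \dots, W_K) \tag{73}$$

$$\le I(W_j; \mathbf{Y}_j \mid W_{j+1}, \dots, W_K) + H(W_j \mid \mathbf{Y}_j) \tag{74}$$

$$= H(\mathbf{Y}_j \mid W_{j+1}, \dots, W_K) - H(\mathbf{Y}_j \mid W_j, \dots, W_K) + o(\epsilon), \tag{75}$$

where (72) follows from the independence of the $W_j$, $j = 1, \dots, K$. From (69), (75) and the fact that $N R_j \le H(W_j)$,

$$H(\mathbf{Y}_j \mid W_{j+1}, \dots, W_K) - H(\mathbf{Y}_j \mid W_j, W_{j+1}, \dots, W_K) \ge N \frac{q}{t_j}\, h(\beta_j t_j) - N \frac{q}{t_{j-1}}\, h(\beta_j t_{j-1}) - o(\epsilon). \tag{76}$$

Next, using Lemma 5 and (76), we show in Appendix V that

$$H(\mathbf{Y}_K) \ge N h(\beta_K q) + N\delta - o(\epsilon), \tag{77}$$

where $q = t_K = \sum_{i=1}^{N} \Pr(X_i = 0)/N$. Since $\epsilon$ can be arbitrarily small for sufficiently large $N$, $o(\epsilon) \to 0$ as $N \to \infty$. For sufficiently large $N$, $H(\mathbf{Y}_K) \ge N h(\beta_K q) + N\delta/2$. However, this contradicts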

$$H(\mathbf{Y}_K) \le \sum_{i=1}^{N} H(Y_{K,i}) \tag{78}$$

$$= \sum_{i=1}^{N} h(\beta_K \cdot \Pr(X_i = 0)) \tag{79}$$

$$\le N h\Big(\beta_K \cdot \sum_{i=1}^{N} \Pr(X_i = 0)/N\Big) \tag{80}$$

$$= N h(\beta_K q). \tag{81}$$

Some of these steps are justified as follows:

• (78) follows from $\mathbf{Y}_K = (Y_{K,1}, \dots, Y_{K,N})$;

• (80) is obtained by applying Jensen's inequality to the concave function $h(\cdot)$;

• (81) follows from $q = \sum_{i=1}^{N} \Pr(X_i = 0)/N$.

The desired contradiction has been obtained, so the theorem is proved.
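The Jensen step from (79) to (80) can be checked numerically for any per-symbol input statistics. The snippet below is a small sketch with a hypothetical helper `sum_vs_jensen`; `probs[i]` plays the role of $\Pr(X_i = 0)$ and entropies are in nats.

```python
import math

def h(x):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def sum_vs_jensen(beta, probs):
    """Return (sum_i h(beta * p_i), N * h(beta * mean(p))) as in (79) and (80)."""
    n = len(probs)
    per_symbol = sum(h(beta * p) for p in probs)
    averaged = n * h(beta * sum(probs) / n)
    return per_symbol, averaged
```

For any list of probabilities the first returned value should not exceed the second, with equality when all entries of `probs` are equal.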

VI. Input-Symmetric Degraded Broadcast Channels

The input-symmetric channel was first introduced in [10] and studied further in [16] [17] [18]. The definition of the input-symmetric channel is as follows: Let $\Phi_n$ denote the symmetric group of permutations of $n$ objects, represented by the $n \times n$ permutation matrices. An $n$-input $m$-output channel with transition probability matrix $T_{m \times n}$ is input-symmetric if the set

$$\mathcal{G}_T = \{G \in \Phi_n \mid \exists\, \Pi \in \Phi_m \text{ s.t. } TG = \Pi T\} \tag{82}$$

is transitive, which means each element of $\{1, \dots, n\}$ can be mapped to every other element of $\{1, \dots, n\}$ by some permutation matrix in $\mathcal{G}_T$ [10]. An important property of the input-symmetric channel is that the uniform distribution achieves capacity.
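For small alphabets, definition (82) can be checked by brute force. The following is a minimal sketch, not part of the original development: the function names are hypothetical, the matrix `T` is given as a list of rows with columns indexed by inputs (0-based here), and floating-point entries are compared after rounding. A column permutation belongs to $\mathcal{G}_T$ exactly when the permuted matrix has the same multiset of rows as $T$.

```python
from itertools import permutations

def symmetry_set(T, ndigits=12):
    """All input permutations sigma such that T with permuted columns is a
    row permutation of T. sigma[i] is the input symbol that i is mapped to."""
    m, n = len(T), len(T[0])

    def row_multiset(M):
        return sorted(tuple(round(v, ndigits) for v in row) for row in M)

    base = row_multiset(T)
    found = []
    for sigma in permutations(range(n)):
        # Column i of T*G is column sigma[i] of T.
        TG = [[T[r][sigma[c]] for c in range(n)] for r in range(m)]
        if row_multiset(TG) == base:
            found.append(sigma)
    return found

def is_transitive(perms, n):
    """True if every input i can be mapped to every input j by some permutation."""
    return all(any(s[i] == j for s in perms) for i in range(n) for j in range(n))

def is_input_symmetric(T):
    return is_transitive(symmetry_set(T), len(T[0]))
```

The search is factorial in the input alphabet size, so it is only meant for toy examples such as binary-symmetric, binary-erasure, or Z channels.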

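Extend the definition of the input-symmetric channel to the input-symmetric DBC as follows:

Definition 2: Input-Symmetric Degraded Broadcast Channel: A discrete memoryless DBC $X \to Y \to Z$ with $|\mathcal{X}| = k$, $|\mathcal{Y}| = n$ and $|\mathcal{Z}| = m$ is input-symmetric if the set

$$\mathcal{G}_{T_{YX},T_{ZX}} = \mathcal{G}_{T_{YX}} \cap \mathcal{G}_{T_{ZX}} \tag{83}$$

$$= \{G \in \Phi_k \mid \exists\, \Pi_{YX} \in \Phi_n, \Pi_{ZX} \in \Phi_m \text{ s.t. } T_{YX} G = \Pi_{YX} T_{YX},\ T_{ZX} G = \Pi_{ZX} T_{ZX}\} \tag{84}$$

is transitive.

Lemma 6: $\mathcal{G}_{T_{YX},T_{ZX}}$ is a group under matrix multiplication.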
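Proof: Every nonempty closed subset of a finite group is a group. Since $\mathcal{G}_{T_{YX},T_{ZX}}$ is a subset of $\Phi_k$, which is a group under matrix multiplication, it suffices to show that $\mathcal{G}_{T_{YX},T_{ZX}}$ is closed under matrix multiplication. Suppose $G_1, G_2 \in \mathcal{G}_{T_{YX},T_{ZX}}$ are such that $T_{YX} G_1 = \Pi_{YX,1} T_{YX}$, $T_{ZX} G_1 = \Pi_{ZX,1} T_{ZX}$, $T_{YX} G_2 = \Pi_{YX,2} T_{YX}$ and $T_{ZX} G_2 = \Pi_{ZX,2} T_{ZX}$. Thus,

$$T_{YX} G_1 G_2 = \Pi_{YX,1} \Pi_{YX,2} T_{YX}, \tag{85}$$

and

$$T_{ZX} G_1 G_2 = \Pi_{ZX,1} \Pi_{ZX,2} T_{ZX}. \tag{86}$$

Therefore, $G_1 G_2 \in \mathcal{G}_{T_{YX},T_{ZX}}$. Q.E.D.

Let $l = |\mathcal{G}_{T_{YX},T_{ZX}}|$ and $\mathcal{G}_{T_{YX},T_{ZX}} = \{G_1, \dots, G_l\}$.

Lemma 7: $\sum_{i=1}^{l} G_i = \frac{l}{k} \mathbf{1}\mathbf{1}^T$, where $\frac{l}{k}$ is an integer and $\mathbf{1}$ is an all-ones vector.

Proof: Since $\mathcal{G}_{T_{YX},T_{ZX}}$ is a group, for all $j = 1, \dots, l$,

$$G_j \Big(\sum_{i=1}^{l} G_i\Big) = \sum_{i=1}^{l} G_j G_i = \sum_{i=1}^{l} G_i. \tag{87}$$

Hence, $\sum_{i=1}^{l} G_i$ has $k$ identical columns and $k$ identical rows since $\mathcal{G}_{T_{YX},T_{ZX}}$ is transitive. Therefore, $\sum_{i=1}^{l} G_i = \frac{l}{k} \mathbf{1}\mathbf{1}^T$. Q.E.D.

Definition 3: A subset $\{G_{i_1}, \dots, G_{i_{l_s}}\}$ of $\mathcal{G}_{T_{YX},T_{ZX}}$ is a smallest transitive subset of $\mathcal{G}_{T_{YX},T_{ZX}}$ if

$$\sum_{j=1}^{l_s} G_{i_j} = \frac{l_s}{k} \mathbf{1}\mathbf{1}^T, \tag{88}$$

where $\frac{l_s}{k}$ is the smallest possible integer for which (88) is satisfied.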

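A. Examples: broadcast binary-symmetric channels and broadcast binary-erasure channels

The class of input-symmetric DBCs includes most of the common discrete memoryless DBCs. For example, the broadcast binary-symmetric channel $X \to Y \to Z$ with marginal transition probability matrices

$$T_{YX} = \begin{bmatrix} 1 - \alpha_1 & \alpha_1 \\ \alpha_1 & 1 - \alpha_1 \end{bmatrix} \quad \text{and} \quad T_{ZX} = \begin{bmatrix} 1 - \alpha_2 & \alpha_2 \\ \alpha_2 & 1 - \alpha_2 \end{bmatrix},$$

where $0 < \alpha_1 < \alpha_2 < 1/2$, is input-symmetric since

$$\mathcal{G}_{T_{YX},T_{ZX}} = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \right\} \tag{89}$$

is transitive.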

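Another interesting example is the broadcast binary-erasure channel with marginal transition probability matrices

$$T_{YX} = \begin{bmatrix} 1 - a_1 & 0 \\ a_1 & a_1 \\ 0 & 1 - a_1 \end{bmatrix} \quad \text{and} \quad T_{ZX} = \begin{bmatrix} 1 - a_2 & 0 \\ a_2 & a_2 \\ 0 & 1 - a_2 \end{bmatrix},$$

where $0 < a_1 < a_2 < 1$. It is input-symmetric since its $\mathcal{G}_{T_{YX},T_{ZX}}$ is the same as that of the broadcast binary-symmetric channel shown in (89).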

B. Example: group-additive DBC

Definition 4: Group-additive Degraded Broadcast Channel: A degraded broadcast channel $X \to Y \to Z$ with $X, Y, Z \in \{1, \dots, n\}$ is a group-additive degraded broadcast channel if there exist two $n$-ary random variables $N_1$ and $N_2$ such that $Y \sim X \oplus N_1$ and $Z \sim Y \oplus N_2$ as shown in Fig. 9, where $\sim$ denotes identical distribution and $\oplus$ denotes group addition.

Fig. 9

The group-additive degraded broadcast channel.

The class of group-additive DBCs includes the broadcast binary-symmetric channel and the discrete additive DBC [11] as special cases.

Theorem 8: Group-additive DBCs are input-symmetric.

Proof: For the group-additive DBC $X \to Y \to Z$ with $X, Y, Z \in \{1, \dots, n\}$, let $G_x$, for $x = 1, \dots, n$, be 0-1 matrices with entries

$$G_x(i, j) = \begin{cases} 1 & \text{if } j \oplus x = i, \\ 0 & \text{otherwise,} \end{cases} \quad \text{for } i, j = 1, \dots, n. \tag{90}$$

The matrices $G_x$, for $x = 1, \dots, n$, are permutation matrices and have the property that $G_{x_1} \cdot G_{x_2} = G_{x_2} \cdot G_{x_1} = G_{x_1 \oplus x_2}$. Let $(\gamma_1, \dots, \gamma_n)^T$ be the distribution of $N_1$. Since $Y$ has the same distribution as $X \oplus N_1$, one has

$$T_{YX} = \sum_{x=1}^{n} \gamma_x G_x. \tag{91}$$

Hence, $T_{YX} G_x = G_x T_{YX}$ for all $x = 1, \dots, n$. Similarly, we have $T_{ZX} G_x = G_x T_{ZX}$ for all $x = 1, \dots, n$, and so

$$\{G_1, \dots, G_n\} \subseteq \mathcal{G}_{T_{YX},T_{ZX}}. \tag{92}$$

Since the set $\{G_1, \dots, G_n\}$ is transitive by definition, $\mathcal{G}_{T_{YX},T_{ZX}}$ is also transitive and hence the group-additive degraded broadcast channel is input-symmetric. Q.E.D.

By definition, $\sum_{j=1}^{n} G_j = \mathbf{1}\mathbf{1}^T$, and hence, $\{G_1, \dots, G_n\}$ is a smallest transitive subset of $\mathcal{G}_{T_{YX},T_{ZX}}$ for the group-additive DBC.
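The construction in (90) and (91) can be checked on a concrete group. The sketch below assumes the cyclic group $\mathbb{Z}_n$ with symbols relabeled $0, \dots, n-1$ (the text uses $1, \dots, n$ and a general group), and the helper names are hypothetical. It builds the shift matrices $G_x$, forms $T = \sum_x \gamma_x G_x$, and lets one confirm that $T$ commutes with every $G_x$ and that the $G_x$ sum to the all-ones matrix.

```python
def shift_matrix(n, x):
    """0-1 matrix G_x with entry (i, j) equal to 1 iff (j + x) mod n == i."""
    return [[1 if (j + x) % n == i else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(len(B[0]))]
            for i in range(len(A))]

def channel_matrix(gamma):
    """T = sum_x gamma[x] * G_x, so T[i][j] = Pr(noise = i - j mod n)."""
    n = len(gamma)
    T = [[0.0] * n for _ in range(n)]
    for x, weight in enumerate(gamma):
        Gx = shift_matrix(n, x)
        for i in range(n):
            for j in range(n):
                T[i][j] += weight * Gx[i][j]
    return T
```

For a non-cyclic group the modular addition in `shift_matrix` would be replaced by that group's operation table; the commutation check relies on the property $G_{x_1} G_{x_2} = G_{x_2} G_{x_1}$ stated in the proof.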
C. Example: IS-DBC not covered in [16] [17]

The class of DDICs and the corresponding DBCs studied in [16] [17] have to satisfy the condition that the transition probability matrix $T_{ZY}$ is input-symmetric, i.e., $\mathcal{G}_{T_{ZY}}$ is transitive. The input-symmetric DBC, however, does not have to satisfy this condition. The following example provides an IS-DBC which is not covered in [16] [17]. Consider a DBC $X \to Y \to Z$ with transition probability matrices

$$T_{YX} = \begin{bmatrix} a & c \\ b & d \\ c & a \\ d & b \end{bmatrix}, \quad T_{ZY} = \begin{bmatrix} e & f & g & h \\ g & h & e & f \end{bmatrix}, \quad \text{and} \quad T_{ZX} = T_{ZY} T_{YX} = \begin{bmatrix} \alpha & \beta \\ \beta & \alpha \end{bmatrix}, \tag{93}$$



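where $a + c = b + d = 1$, $e + f + g + h = 1$, $\alpha = ae + bf + cg + dh$ and $\beta = ag + bh + ce + df$. This DBC is input-symmetric since its $\mathcal{G}_{T_{YX},T_{ZX}}$ is the same as that of the broadcast binary-symmetric channel shown in (89). It is not covered in [16] [17] because

$$\mathcal{G}_{T_{ZY}} = \left\{ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \right\} \tag{94}$$

is not transitive.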
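D. Optimal input distribution and capacity region

Consider the input-symmetric DBC $X \to Y \to Z$ with the marginal transition probability matrices $T_{YX}$ and $T_{ZX}$. Recall that the set $\mathcal{C}$ is the set of all $(\mathbf{p}, \xi, \eta)$ satisfying (9), (10) and (11) for some choice of $l$, $\mathbf{w}$ and $\mathbf{p}_j$, $j = 1, \dots, l$, the set $\mathcal{C}^* = \{(\xi, \eta) \mid (\mathbf{q}, \xi, \eta) \in \mathcal{C} \text{ for some } \mathbf{q}\}$ is the projection of the set $\mathcal{C}$ on the $(\xi, \eta)$-plane, and the set $\mathcal{C}^*_{\mathbf{q}} = \{(\xi, \eta) \mid (\mathbf{q}, \xi, \eta) \in \mathcal{C}\}$ is the projection on the $(\xi, \eta)$-plane of the intersection of $\mathcal{C}$ with the two-dimensional plane $\mathbf{p} = \mathbf{q}$.

Lemma 8: For any permutation matrix $G \in \mathcal{G}_{T_{YX},T_{ZX}}$ and $(\mathbf{p}, \xi, \eta) \in \mathcal{C}$, one has $(G\mathbf{p}, \xi, \eta) \in \mathcal{C}$.

Proof: Since $(\mathbf{p}, \xi, \eta)$ satisfies (9), (10) and (11) for some choice of $l$, $\mathbf{w}$ and $\mathbf{p}_j$,

$$\sum_{j=1}^{l} w_j G \mathbf{p}_j = G\mathbf{p}, \tag{95}$$

$$\sum_{j=1}^{l} w_j h_n(T_{YX} G \mathbf{p}_j) = \sum_{j=1}^{l} w_j h_n(\Pi_{YX} T_{YX} \mathbf{p}_j) = \xi, \tag{96}$$

$$\sum_{j=1}^{l} w_j h_m(T_{ZX} G \mathbf{p}_j) = \sum_{j=1}^{l} w_j h_m(\Pi_{ZX} T_{ZX} \mathbf{p}_j) = \eta. \tag{97}$$

Hence, $(G\mathbf{p}, \xi, \eta)$ satisfies (9), (10) and (11) for the choice of $l$, $\mathbf{w}$ and $G\mathbf{p}_j$, $j = 1, \dots, l$. Q.E.D.

Corollary 1: For all $\mathbf{p} \in \Delta_k$ and $G \in \mathcal{G}_{T_{YX},T_{ZX}}$, one has $\mathcal{C}^*_{G\mathbf{p}} = \mathcal{C}^*_{\mathbf{p}}$, and so $F^*(G\mathbf{p}, s) = F^*(\mathbf{p}, s)$ for any $H(Y|X) \le s \le H(Y)$.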

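Lemma 9: For any input-symmetric DBC, $\mathcal{C}^* = \mathcal{C}^*_{\mathbf{u}}$, where $\mathbf{u}$ denotes the uniform distribution.

Proof: For any $(\xi, \eta) \in \mathcal{C}^*$, there exists a distribution $\mathbf{p}$ such that $(\mathbf{p}, \xi, \eta) \in \mathcal{C}$. Let $\mathcal{G}_{T_{YX},T_{ZX}} = \{G_1, \dots, G_l\}$. By Corollary 1, $(G_j \mathbf{p}, \xi, \eta) \in \mathcal{C}$ for all $j = 1, \dots, l$. By the convexity of the set $\mathcal{C}$,

$$(\mathbf{q}, \xi, \eta) = \Big(\sum_{j=1}^{l} \frac{1}{l} G_j \mathbf{p},\ \xi,\ \eta\Big) \in \mathcal{C}, \tag{98}$$

where $\mathbf{q} = \sum_{j=1}^{l} \frac{1}{l} G_j \mathbf{p}$. Since $\mathcal{G}_{T_{YX},T_{ZX}}$ is a group, for any permutation matrix $G' \in \mathcal{G}_{T_{YX},T_{ZX}}$,

$$G'\mathbf{q} = \sum_{j=1}^{l} \frac{1}{l} G' G_j \mathbf{p} = \sum_{j=1}^{l} \frac{1}{l} G_j \mathbf{p} = \mathbf{q}. \tag{99}$$

Since $G'\mathbf{q} = \mathbf{q}$, the $i$-th entry and the $j$-th entry of $\mathbf{q}$ are the same if $G'$ permutes the $i$-th row to the $j$-th row. Since the set $\mathcal{G}_{T_{YX},T_{ZX}}$ for the input-symmetric DBC is transitive, all the entries of $\mathbf{q}$ are the same, and so $\mathbf{q} = \mathbf{u}$. This implies that $(\xi, \eta) \in \mathcal{C}^*_{\mathbf{u}}$. Since $(\xi, \eta)$ is arbitrarily taken from $\mathcal{C}^*$, one has $\mathcal{C}^* \subseteq \mathcal{C}^*_{\mathbf{u}}$. On the other hand, by definition, $\mathcal{C}^* \supseteq \mathcal{C}^*_{\mathbf{u}}$. Therefore, $\mathcal{C}^* = \mathcal{C}^*_{\mathbf{u}}$. Q.E.D.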
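The averaging argument in (98) and (99) is easy to reproduce numerically. The sketch below uses hypothetical helper names and represents each permutation matrix $G$ by a tuple `sigma`, where `sigma[i]` is the coordinate that coordinate `i` is moved to. Averaging $G_j \mathbf{p}$ over a transitive group of permutations should return the uniform distribution, whatever $\mathbf{p}$ is.

```python
def apply_perm(sigma, p):
    """Return G p, where G moves the mass at coordinate i to coordinate sigma[i]."""
    out = [0.0] * len(p)
    for i, mass in enumerate(p):
        out[sigma[i]] += mass
    return out

def group_average(perms, p):
    """Return (1/l) * sum_j G_j p for the list of permutations perms."""
    l = len(perms)
    avg = [0.0] * len(p)
    for sigma in perms:
        for i, value in enumerate(apply_perm(sigma, p)):
            avg[i] += value / l
    return avg
```

For example, the four cyclic shifts on four symbols form a transitive group, so `group_average` applied to any distribution on four symbols gives a vector with all entries close to 1/4.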

Now we state and prove that the uniformly distributed $X$ is optimal for input-symmetric DBCs.

Theorem 9: For any input-symmetric DBC, its capacity region can be achieved by using transmission strategies such that the broadcast signal $X$ is uniformly distributed. As a consequence, the capacity region is

$$\overline{\mathrm{co}}\left\{(R_1, R_2) : R_1 \le s - h_n(T_{YX}\mathbf{e}_1),\ R_2 \le h_m(T_{ZX}\mathbf{u}) - F^*_{T_{YX},T_{ZX}}(\mathbf{u}, s),\ h_n(T_{YX}\mathbf{e}_1) \le s \le \ln(n)\right\}, \tag{100}$$

where $\mathbf{e}_1 = (1, 0, \dots, 0)^T$, $n = |\mathcal{Y}|$, and $m = |\mathcal{Z}|$.

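Proof: Let $\mathbf{q} = (q_1, \dots, q_k)^T$ be the distribution of the channel input $X$ for the input-symmetric DBC $X \to Y \to Z$. Since $\mathcal{G}_{T_{YX}}$ is transitive, the columns of $T_{YX}$ are permutations of each other. Hence

$$H(Y|X) = \sum_{i=1}^{k} q_i H(Y|X = i) \tag{101}$$

$$= \sum_{i=1}^{k} q_i h_n(T_{YX}\mathbf{e}_i) \tag{102}$$

$$= \sum_{i=1}^{k} q_i h_n(T_{YX}\mathbf{e}_1) \tag{103}$$

$$= h_n(T_{YX}\mathbf{e}_1), \tag{104}$$

which is independent of $\mathbf{q}$. Let $l = |\mathcal{G}_{T_{YX},T_{ZX}}|$ and $\mathcal{G}_{T_{YX},T_{ZX}} = \{G_1, \dots, G_l\}$. Then

$$H(Z) = h_m(T_{ZX}\mathbf{q}) \tag{105}$$

$$= \frac{1}{l} \sum_{i=1}^{l} h_m(T_{ZX}\mathbf{q}) \tag{106}$$

$$= \frac{1}{l} \sum_{i=1}^{l} h_m(T_{ZX} G_i \mathbf{q}) \tag{107}$$

$$\le h_m\Big(T_{ZX} \frac{1}{l} \sum_{i=1}^{l} G_i \mathbf{q}\Big) \tag{108}$$

$$= h_m(T_{ZX}\mathbf{u}), \tag{109}$$

where (108) follows from Jensen's inequality. Since $\mathcal{C}^*_{\mathbf{q}} \subseteq \mathcal{C}^*_{\mathbf{u}}$ for the input-symmetric DBC,

$$F^*(\mathbf{q}, s) \ge F^*(\mathbf{u}, s). \tag{110}$$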
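Plugging (104), (109) and (110) into (7), the expression of the capacity region for the DBC, the capacity region for input-symmetric DBCs is

$$\overline{\mathrm{co}}\Big[\bigcup_{\mathbf{q} \in \Delta_k} \{(R_1, R_2) : R_1 \le s - H(Y|X),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(\mathbf{q}, s)\}\Big] \tag{111}$$

$$\subseteq \overline{\mathrm{co}}\Big[\bigcup_{\mathbf{q} \in \Delta_k} \{(R_1, R_2) : R_1 \le s - h_n(T_{YX}\mathbf{e}_1),\ R_2 \le h_m(T_{ZX}\mathbf{u}) - F^*_{T_{YX},T_{ZX}}(\mathbf{u}, s)\}\Big] \tag{112}$$

$$= \overline{\mathrm{co}}\{(R_1, R_2) : R_1 \le s - h_n(T_{YX}\mathbf{e}_1),\ R_2 \le h_m(T_{ZX}\mathbf{u}) - F^*_{T_{YX},T_{ZX}}(\mathbf{u}, s)\} \tag{113}$$

$$= \overline{\mathrm{co}}\{(R_1, R_2) : \mathbf{p}_X = \mathbf{u},\ R_1 \le s - H(Y|X),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(\mathbf{u}, s)\} \tag{114}$$

$$\subseteq \overline{\mathrm{co}}\Big[\bigcup_{\mathbf{q} \in \Delta_k} \{(R_1, R_2) : R_1 \le s - H(Y|X),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(\mathbf{q}, s)\}\Big]. \tag{115}$$

Note that (111) and (115) are identical expressions, hence (111) through (115) are all equal. Therefore, (100) and (113) express the capacity region for the input-symmetric DBC, which also means that the capacity region can be achieved by using transmission strategies where the broadcast signal $X$ is uniformly distributed. Q.E.D.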



E. Permutation encoding approach and its optimality

The permutation encoding approach is an independent-encoding scheme which achieves the capacity region for input-symmetric DBCs. The block diagram of this approach is shown in Fig. 10. In Fig. 10, $W_1$ is the message for User 1, which sees the better channel $T_{YX}$, and $W_2$ is the message for User 2, which sees the worse channel $T_{ZX}$. The permutation encoding approach is first to independently encode these two messages into two codewords $\mathbf{X}_1$ and $\mathbf{X}_2$, and then to combine these two independent codewords using a single-letter operation.

Let $\mathcal{G}_s$ be a smallest transitive subset of $\mathcal{G}_{T_{YX},T_{ZX}}$. Denote $k = |\mathcal{X}|$ and $l_s = |\mathcal{G}_s|$. Use a random coding technique to design the codebook for User 1 according to the $k$-ary random variable $X_1$ with distribution $\mathbf{p}_1$ and the codebook for User 2 according to the $l_s$-ary random variable $X_2$ with uniform distribution. Let $\mathcal{G}_s = \{G_1, \dots, G_{l_s}\}$. Define the permutation function $g_{x_2}(x_1) = x$ if the permutation matrix $G_{x_2}$ maps the $x_1$-th column to the $x$-th column, where $x_2 \in \{1, \dots, l_s\}$ and $x, x_1 \in \{1, \dots, k\}$. Hence, $g_{x_2}(x_1) = x$ if and only if the $x$-th row, $x_1$-th column entry of $G_{x_2}$ is 1. The permutation encoding approach is then to broadcast $\mathbf{X}$, which is obtained by applying the single-letter permutation function $X = g_{X_2}(X_1)$ on symbols of the codewords $\mathbf{X}_1$ and $\mathbf{X}_2$. Since $X_2$ is uniformly distributed and $\sum_{j=1}^{l_s} G_j = \frac{l_s}{k} \mathbf{1}\mathbf{1}^T$, the broadcast signal $X$ is also uniformly distributed.

User 2 receives $\mathbf{Z}$ and decodes the desired message directly. User 1 receives $\mathbf{Y}$ and successively decodes the message for User 2 and then for User 1. The structure of the successive decoder is shown in Fig. 11. Note that Decoder 1 in Fig. 11 is not a joint decoder even though it has two inputs $\mathbf{Y}$ and $\mathbf{X}_2$.

Fig. 10

The block diagram of the permutation encoding approach

Fig. 11

The structure of the successive decoder for input-symmetric DBCs

In particular, for the group-additive DBC with $Y \sim X \oplus N_1$ and $Z \sim Y \oplus N_2$, the permutation function $g_{x_2}(x_1)$ is the group addition $x_2 \oplus x_1$. Hence the permutation encoding approach for the group-additive DBC is the NE scheme for the group-additive DBC. The successive decoder for the group-additive DBC is shown in Fig. 12, where

$$\tilde{y} = y \oplus (-x_2). \tag{116}$$

Fig. 12

The structure of the successive decoder for degraded group-addition channels
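As an illustration of the single-letter map and of (116), the sketch below specializes the permutation encoding approach to the cyclic group $\mathbb{Z}_k$ with symbols relabeled $0, \dots, k-1$. This choice of group and all function names are assumptions made for the example. It computes the exact distribution of the broadcast symbol when $X_2$ is uniform, and shows the shift that Decoder 1 applies once the User 2 symbol is known.

```python
def encode_symbol(x1, x2, k):
    """Single-letter map X = g_{x2}(x1), here (x1 + x2) mod k."""
    return (x1 + x2) % k

def broadcast_distribution(p1):
    """Exact distribution of X when X1 ~ p1 and X2 is uniform on Z_k."""
    k = len(p1)
    px = [0.0] * k
    for x1, prob in enumerate(p1):
        for x2 in range(k):
            px[encode_symbol(x1, x2, k)] += prob / k
    return px

def remove_user2_symbol(y, x2, k):
    """Decoder 1 preprocessing as in (116): y_tilde = y + (-x2) mod k."""
    return (y - x2) % k
```

Whatever distribution `p1` is chosen for User 1, `broadcast_distribution` returns the uniform distribution, matching the statement that the broadcast signal is uniformly distributed.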



From the coding theorem for DBCs [2] [3], the achievable region of the permutation encoding approach for the input-symmetric DBC is determined by

$$R_1 \le I(X; Y | X_2) \tag{117}$$

$$= H(Y | X_2) - H(Y | X) \tag{118}$$

$$= \sum_{x_2=1}^{l_s} \Pr(X_2 = x_2) H(Y | X_2 = x_2) - \sum_{x=1}^{k} \Pr(X = x) H(Y | X = x) \tag{119}$$

$$= \sum_{x_2=1}^{l_s} \Pr(X_2 = x_2) h_n(T_{YX} G_{x_2} \mathbf{p}_1) - \sum_{x=1}^{k} \Pr(X = x) h_n(T_{YX} \mathbf{e}_x) \tag{120}$$

$$= \sum_{x_2=1}^{l_s} \Pr(X_2 = x_2) h_n(\Pi_{YX,x_2} T_{YX} \mathbf{p}_1) - \sum_{x=1}^{k} \Pr(X = x) h_n(T_{YX} \mathbf{e}_1) \tag{121}$$

$$= h_n(T_{YX} \mathbf{p}_1) - h_n(T_{YX} \mathbf{e}_1), \tag{122}$$

and

$$R_2 \le I(X_2; Z) \tag{123}$$

$$= H(Z) - H(Z | X_2) \tag{124}$$

$$= h_m(T_{ZX} \mathbf{u}) - \sum_{x_2=1}^{l_s} \Pr(X_2 = x_2) h_m(T_{ZX} G_{x_2} \mathbf{p}_1) \tag{125}$$

$$= h_m(T_{ZX} \mathbf{u}) - \sum_{x_2=1}^{l_s} \Pr(X_2 = x_2) h_m(\Pi_{ZX,x_2} T_{ZX} \mathbf{p}_1) \tag{126}$$

$$= h_m(T_{ZX} \mathbf{u}) - h_m(T_{ZX} \mathbf{p}_1), \tag{127}$$

where $\mathbf{u}$ is the $k$-ary uniform distribution, $\mathbf{p}_1$ is the distribution of $X_1$, and $\mathbf{e}_x$ is a 0-1 vector such that the $x$-th entry is 1 and all other entries are 0. Hence, the achievable region is


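$$\overline{\mathrm{co}}\Big[\bigcup_{\mathbf{p}_1 \in \Delta_k} \{(R_1, R_2) : R_1 \le h_n(T_{YX} \mathbf{p}_1) - h_n(T_{YX} \mathbf{e}_1),\ R_2 \le h_m(T_{ZX} \mathbf{u}) - h_m(T_{ZX} \mathbf{p}_1)\}\Big]. \tag{129}$$

Define $F(s)$ as the infimum of $h_m(T_{ZX} \mathbf{p}_1)$ with respect to all distributions $\mathbf{p}_1$ such that $h_n(T_{YX} \mathbf{p}_1) = s$. Hence the achievable region (129) can be expressed as

$$\{(R_1, R_2) : R_1 \le s - h_n(T_{YX} \mathbf{e}_1),\ R_2 \le h_m(T_{ZX} \mathbf{u}) - \mathrm{env}F(s),\ h_n(T_{YX} \mathbf{e}_1) \le s \le h_n(T_{YX} \mathbf{u})\}, \tag{130}$$

where $\mathrm{env}F(s)$ denotes the lower convex envelope of $F(s)$. In order to show that the achievable region (130) is the same as the capacity region (100) for the input-symmetric DBC, it suffices to show that

$$\mathrm{env}F(s) \le F^*(\mathbf{u}, s). \tag{131}$$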
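For the broadcast binary-symmetric channel, the rate pairs inside the union in (129) have a simple closed form. The sketch below evaluates them for $\mathbf{p}_1 = (1 - p, p)^T$; the function names are hypothetical and entropies are in nats, so $h_2(T_{ZX}\mathbf{u}) = \ln 2$.

```python
import math

def h(x):
    """Binary entropy in nats, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def conv(alpha, p):
    """Probability of output 1 when a Bernoulli(p) input crosses a BSC(alpha)."""
    return alpha * (1.0 - p) + (1.0 - alpha) * p

def bsc_rate_pair(alpha1, alpha2, p):
    """Rate pair from (129) for the broadcast BSC with p1 = (1 - p, p)."""
    r1 = h(conv(alpha1, p)) - h(alpha1)          # h(T_YX p1) - h(T_YX e1)
    r2 = math.log(2.0) - h(conv(alpha2, p))      # h(T_ZX u)  - h(T_ZX p1)
    return r1, r2
```

Setting `p = 0` gives all rate to User 2 and `p = 0.5` gives all rate to User 1; intermediate values trace the trade-off between the two.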
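For any $U \to X$ with uniformly distributed $X$,

$$H(Z|U) = \sum_u \Pr(U = u) H(Z | U = u) \tag{132}$$

$$= \sum_u \Pr(U = u) h_m(T_{ZX} \mathbf{p}_{X|U=u}) \tag{133}$$

$$\ge \sum_u \Pr(U = u) F(h_n(T_{YX} \mathbf{p}_{X|U=u})) \tag{134}$$

$$\ge \sum_u \Pr(U = u)\, \mathrm{env}F(h_n(T_{YX} \mathbf{p}_{X|U=u})) \tag{135}$$

$$\ge \mathrm{env}F\Big(\sum_u \Pr(U = u) h_n(T_{YX} \mathbf{p}_{X|U=u})\Big) \tag{136}$$

$$= \mathrm{env}F(H(Y|U)), \tag{137}$$

where $\mathbf{p}_{X|U=u}$ is the conditional distribution of $X$ given $U = u$. Some of these steps are justified as follows:

• (134) follows from the definition of $F(s)$ as an infimum;

• (135) follows from the fact that the lower convex envelope $\mathrm{env}F$ never exceeds $F$;

• (136) follows from Jensen's inequality.

Therefore, by definition, $\mathrm{env}F(s) \le F^*(\mathbf{u}, s)$.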

The results of this subsection may be summarized in the following theorem. 

Theorem 10: The permutation encoding approach achieves the capacity region for input-symmetric DBCs, which is expressed in (129), (130) and (100).

Corollary 2: The group-addition encoding approach achieves the capacity region for group-additive degraded broadcast channels.

Conjecture 1: The alphabet size of the code for User 2, $l_s$, is equal to the alphabet size of the channel input, $k$, in a permutation encoding approach for any input-symmetric DBC. In other words, a smallest transitive subset $\{G_1, \dots, G_{l_s}\}$ of $\mathcal{G}_{T_{YX},T_{ZX}}$ for any input-symmetric DBC has

$$\sum_{j=1}^{l_s} G_j = \mathbf{1}\mathbf{1}^T. \tag{138}$$

Fig. 13 

The discrete degraded broadcast multiplication channel. 



VII. Discrete Multiplicative Degraded Broadcast Channels 



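Definition 5: Discrete Multiplicative Degraded Broadcast Channel: A discrete DBC $X \to Y \to Z$ with $X, Y, Z \in \{0, 1, \dots, n\}$ is a discrete multiplicative degraded broadcast channel if there exist two $(n+1)$-ary random variables $N_1$ and $N_2$ such that $Y \sim X \otimes N_1$ and $Z \sim Y \otimes N_2$ as shown in Fig. 13, where $\otimes$ denotes discrete multiplication.

By the definition of discrete multiplication and group addition, the multiplication of zero and any element in $\{0, 1, \dots, n\}$ is always zero, and $\{1, \dots, n\}$ under the discrete multiplication operation forms a group. Hence, the discrete DBC $X \to Y \to Z$ has the channel structure shown in Fig. 14. The sub-channel $\tilde{X} \to \tilde{Y} \to \tilde{Z}$ is a group-additive DBC with marginal distributions $T_{\tilde{Y}\tilde{X}}$ and $T_{\tilde{Z}\tilde{X}} = T_{\tilde{Z}\tilde{Y}} T_{\tilde{Y}\tilde{X}}$, where $\tilde{X}, \tilde{Y}, \tilde{Z} \in \{1, \dots, n\}$. For the discrete multiplicative DBC $X \to Y \to Z$, if the channel input $X$ is zero, the channel outputs $Y$ and $Z$ are zero for sure. If the channel input is a non-zero symbol, the channel output $Y$ is zero with probability $\alpha_1$ and $Z$ is zero with probability $\alpha_2$, where $\alpha_2 = \alpha_1 + (1 - \alpha_1)\alpha_\Delta$. Therefore, the marginal transition probability matrices for $X \to Y \to Z$ are

$$T_{YX} = \begin{bmatrix} 1 & \alpha_1 \mathbf{1}^T \\ \mathbf{0} & (1 - \alpha_1) T_{\tilde{Y}\tilde{X}} \end{bmatrix}, \qquad T_{ZY} = \begin{bmatrix} 1 & \alpha_\Delta \mathbf{1}^T \\ \mathbf{0} & (1 - \alpha_\Delta) T_{\tilde{Z}\tilde{Y}} \end{bmatrix}, \tag{139}$$

$$T_{ZX} = T_{ZY} T_{YX} = \begin{bmatrix} 1 & \alpha_\Delta \mathbf{1}^T \\ \mathbf{0} & (1 - \alpha_\Delta) T_{\tilde{Z}\tilde{Y}} \end{bmatrix} \begin{bmatrix} 1 & \alpha_1 \mathbf{1}^T \\ \mathbf{0} & (1 - \alpha_1) T_{\tilde{Y}\tilde{X}} \end{bmatrix} = \begin{bmatrix} 1 & \alpha_2 \mathbf{1}^T \\ \mathbf{0} & (1 - \alpha_2) T_{\tilde{Z}\tilde{X}} \end{bmatrix}, \tag{140}$$

where $\mathbf{1}$ is an all-ones vector and $\mathbf{0}$ is an all-zeros vector.

Fig. 14

The channel structure of a DBC with erasures.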
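The block-matrix identity (140) can be verified numerically for a concrete sub-channel. The sketch below assumes a cyclic group-additive sub-channel on three symbols, and the helper names `cyclic_channel`, `extend`, and `matmul` are hypothetical. It builds the extended matrices of (139) and lets one compare their product with the right-hand side of (140).

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(len(B[0]))]
            for i in range(len(A))]

def cyclic_channel(gamma):
    """Group-additive channel on Z_n: T[i][j] = Pr(noise = (i - j) mod n)."""
    n = len(gamma)
    return [[gamma[(i - j) % n] for j in range(n)] for i in range(n)]

def extend(alpha, T):
    """Block matrix [[1, alpha * 1^T], [0, (1 - alpha) * T]] as in (139)."""
    n = len(T)
    top = [1.0] + [alpha] * n
    rest = [[0.0] + [(1.0 - alpha) * v for v in T[i]] for i in range(n)]
    return [top] + rest
```

With $\alpha_2 = \alpha_1 + (1 - \alpha_1)\alpha_\Delta$, the product `matmul(extend(alpha_delta, T_zy), extend(alpha1, T_yx))` should agree entrywise with `extend(alpha2, matmul(T_zy, T_yx))`, because every column of the sub-channel matrix sums to one.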



A. Optimal input distribution 



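The sub-channel $\tilde{X} \to \tilde{Y} \to \tilde{Z}$ is a group-additive DBC, and hence, $\mathcal{G}_{T_{\tilde{Y}\tilde{X}},T_{\tilde{Z}\tilde{X}}}$ is transitive. For any $n \times n$ permutation matrix $\tilde{G} \in \mathcal{G}_{T_{\tilde{Y}\tilde{X}},T_{\tilde{Z}\tilde{X}}}$ with $T_{\tilde{Y}\tilde{X}} \tilde{G} = \Pi_{\tilde{Y}\tilde{X}} T_{\tilde{Y}\tilde{X}}$ and $T_{\tilde{Z}\tilde{X}} \tilde{G} = \Pi_{\tilde{Z}\tilde{X}} T_{\tilde{Z}\tilde{X}}$, the $(n+1) \times (n+1)$ permutation matrix

$$G = \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \tilde{G} \end{bmatrix} \tag{141}$$

has

$$T_{YX} G = \begin{bmatrix} 1 & \alpha_1 \mathbf{1}^T \\ \mathbf{0} & (1 - \alpha_1) T_{\tilde{Y}\tilde{X}} \end{bmatrix} \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \tilde{G} \end{bmatrix} = \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \Pi_{\tilde{Y}\tilde{X}} \end{bmatrix} T_{YX}, \tag{142}$$

and so $G \in \mathcal{G}_{T_{YX}}$. Similarly, $G \in \mathcal{G}_{T_{ZX}}$, and hence $G \in \mathcal{G}_{T_{YX},T_{ZX}}$. Therefore, any non-zero element in $\{0, 1, \dots, n\}$ can be mapped to any other non-zero element in $\{0, 1, \dots, n\}$ by some permutation matrix in $\mathcal{G}_{T_{YX},T_{ZX}}$; however, no matrix in $\mathcal{G}_{T_{YX},T_{ZX}}$ maps zero to a non-zero element or a non-zero element to zero. Hence, any permutation matrix $G \in \mathcal{G}_{T_{YX},T_{ZX}}$ has the form

$$G = \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \tilde{G} \end{bmatrix} \tag{143}$$

for some $\tilde{G} \in \mathcal{G}_{T_{\tilde{Y}\tilde{X}},T_{\tilde{Z}\tilde{X}}}$. These results may be summarized in the following lemma:

Lemma 10: Let $\mathcal{G}_{T_{\tilde{Y}\tilde{X}},T_{\tilde{Z}\tilde{X}}} = \{\tilde{G}_1, \dots, \tilde{G}_l\}$. Then $\mathcal{G}_{T_{YX},T_{ZX}} = \{G_1, \dots, G_l\}$, where

$$G_j = \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \tilde{G}_j \end{bmatrix} \tag{144}$$

for $j = 1, \dots, l$.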

Now we state and prove that the uniformly distributed $\tilde{X}$ is optimal for the discrete degraded broadcast multiplication channel.

Lemma 11: Let $\mathbf{p}_X = (1 - q, q\mathbf{p}_{\tilde{X}}^T)^T \in \Delta_{n+1}$ be the distribution of the channel input $X$, where $\mathbf{p}_{\tilde{X}}$ is the distribution of $\tilde{X}$. For any discrete multiplicative DBC, $\mathcal{C}^*_{\mathbf{p}_X} \subseteq \mathcal{C}^*_{(1-q,\, q\mathbf{u}^T)^T}$ and $\mathcal{C}^* = \bigcup_{q \in [0,1]} \mathcal{C}^*_{(1-q,\, q\mathbf{u}^T)^T}$, where $\mathbf{u} \in \Delta_n$ denotes the uniform distribution.

The proof of Lemma 11 is similar to that of Lemma 9 and the details are given in Appendix VI.

Theorem 11: The capacity region of the discrete multiplicative degraded broadcast channel can be achieved by using transmission strategies where $\tilde{X}$ is uniformly distributed, i.e., the distribution of $X$ has $\mathbf{p}_X = (1 - q, q\mathbf{u}^T)^T$ for some $q \in [0, 1]$. As a consequence, the capacity region is

$$\overline{\mathrm{co}}\Big[\bigcup_{q \in [0,1]} \big\{(R_1, R_2) : R_1 \le s - q\, h_{n+1}(T_{YX}\mathbf{e}_1),\ R_2 \le h((1 - \alpha_2)q) + (1 - \alpha_2)\, q \ln(n) - F^*_{T_{YX},T_{ZX}}((1 - q, q\mathbf{u}^T)^T, s)\big\}\Big]. \tag{145}$$
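Proof: Let $\mathbf{p}_X = (1 - q, q\mathbf{p}_{\tilde{X}}^T)^T$ be the distribution of the channel input $X$, where $\mathbf{p}_{\tilde{X}} = (p_1, \dots, p_n)^T$. Since $\mathcal{G}_{T_{\tilde{Y}\tilde{X}}}$ is transitive, the columns of $T_{\tilde{Y}\tilde{X}}$ are permutations of each other. Hence

$$H(Y|X) = \sum_{i=0}^{n} \Pr(X = i) H(Y|X = i) \tag{146}$$

$$= (1 - q) H(Y|X = 0) + \sum_{i=1}^{n} q p_i h_{n+1}(T_{YX}\mathbf{e}_i) \tag{147}$$

$$= \sum_{i=1}^{n} q p_i h_{n+1}(T_{YX}\mathbf{e}_1) \tag{148}$$

$$= q\, h_{n+1}(T_{YX}\mathbf{e}_1), \tag{149}$$

which is independent of $\mathbf{p}_{\tilde{X}}$. Let $\mathcal{G}_{T_{YX},T_{ZX}} = \{G_1, \dots, G_l\}$. Then

$$H(Z) = h_{n+1}(T_{ZX}\mathbf{p}_X) \tag{150}$$

$$= \frac{1}{l} \sum_{i=1}^{l} h_{n+1}(T_{ZX} G_i \mathbf{p}_X) \tag{151}$$

$$\le h_{n+1}\Big(T_{ZX} \frac{1}{l} \sum_{i=1}^{l} G_i \mathbf{p}_X\Big) \tag{152}$$

$$= h_{n+1}(T_{ZX}(1 - q, q\mathbf{u}^T)^T) \tag{153}$$

$$= h((1 - \alpha_2)q) + (1 - \alpha_2)\, q \ln(n), \tag{154}$$

where (152) follows from Jensen's inequality. Since $\mathcal{C}^*_{\mathbf{p}_X} \subseteq \mathcal{C}^*_{(1-q,\, q\mathbf{u}^T)^T}$ for the discrete multiplicative DBC,

$$F^*(\mathbf{p}_X, s) \ge F^*((1 - q, q\mathbf{u}^T)^T, s). \tag{155}$$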
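Plugging (149), (154) and (155) into (7), the capacity region for discrete multiplicative DBCs is

$$\overline{\mathrm{co}}\Big[\bigcup_{\mathbf{p}_X \in \Delta_{n+1}} \{(R_1, R_2) : R_1 \le s - H(Y|X),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(\mathbf{p}_X, s)\}\Big] \tag{156}$$

$$\subseteq \overline{\mathrm{co}}\Big[\bigcup_{\mathbf{p}_X \in \Delta_{n+1}} \{(R_1, R_2) : R_1 \le s - q\, h_{n+1}(T_{YX}\mathbf{e}_1),\ R_2 \le h((1 - \alpha_2)q) + (1 - \alpha_2)\, q \ln(n) - F^*_{T_{YX},T_{ZX}}((1 - q, q\mathbf{u}^T)^T, s)\}\Big] \tag{157}$$

$$= \overline{\mathrm{co}}\Big[\bigcup_{q \in [0,1]} \{(R_1, R_2) : R_1 \le s - q\, h_{n+1}(T_{YX}\mathbf{e}_1),\ R_2 \le h((1 - \alpha_2)q) + (1 - \alpha_2)\, q \ln(n) - F^*_{T_{YX},T_{ZX}}((1 - q, q\mathbf{u}^T)^T, s)\}\Big] \tag{158}$$

$$= \overline{\mathrm{co}}\Big[\bigcup_{\mathbf{p}_X = (1-q,\, q\mathbf{u}^T)^T} \{(R_1, R_2) : R_1 \le s - H(Y|X),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(\mathbf{p}_X, s)\}\Big] \tag{159}$$

$$\subseteq \overline{\mathrm{co}}\Big[\bigcup_{\mathbf{p}_X \in \Delta_{n+1}} \{(R_1, R_2) : R_1 \le s - H(Y|X),\ R_2 \le H(Z) - F^*_{T_{YX},T_{ZX}}(\mathbf{p}_X, s)\}\Big], \tag{160}$$

where $\overline{\mathrm{co}}$ denotes the convex hull of the closure. Note that (156) and (160) are identical expressions, hence (156) through (160) are all equal. Therefore, (158) expresses the capacity region for the DM-DBC, which also means that the capacity region can be achieved by using transmission strategies where the broadcast signal $X$ has distribution $\mathbf{p}_X = (1 - q, q\mathbf{u}^T)^T$ for some $q \in [0, 1]$. Q.E.D.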
Fig. 15

The block diagram of the NE scheme for the discrete multiplicative DBC. 



B. Optimality of the NE scheme for DM-DBCs 



The NE scheme for the discrete multiplicative DBC is shown in Fig. 15. $W_1$ is the message for User 1, who sees the better channel $T_{YX}$, and $W_2$ is the message for User 2, who sees the worse channel $T_{ZX}$. The NE scheme is first to independently encode these two messages into two codewords $\mathbf{X}_1$ and $\mathbf{X}_2$ respectively, and then to broadcast $\mathbf{X}$, which is obtained by applying the single-letter multiplication function $X = X_2 \otimes X_1$ on symbols of the codewords $\mathbf{X}_1$ and $\mathbf{X}_2$. The distribution of $X_2$ is constrained to be $\mathbf{p}_{X_2} = (1 - q, q\mathbf{u}^T)^T$ for some $q \in [0, 1]$, and hence the distribution of the broadcast signal $X$ also has $\mathbf{p}_X = (1 - q, q\mathbf{u}^T)^T$ for some $q \in [0, 1]$, which was proved to be the optimal input distribution for the discrete multiplicative DBC. User 2 receives $\mathbf{Z}$ and decodes the desired message directly. User 1 receives $\mathbf{Y}$ and successively decodes the message for User 2 and then for User 1.
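The claim that the broadcast symbol keeps the form $(1 - q, q\mathbf{u}^T)^T$ can be checked with a small exact computation. The sketch below is an illustration under stated assumptions: the non-zero symbols $1, \dots, n$ are taken to multiply like the cyclic group $\mathbb{Z}_n$ (one possible choice of group), and the names `dmul` and `ne_broadcast_distribution` are hypothetical.

```python
def dmul(a, b, n):
    """Discrete multiplication on {0, ..., n}: 0 is absorbing, and the
    non-zero symbols 1..n are combined like the cyclic group Z_n (assumed)."""
    if a == 0 or b == 0:
        return 0
    return (a - 1 + b - 1) % n + 1

def ne_broadcast_distribution(p1, q):
    """Exact distribution of X = X2 (x) X1 for X1 ~ p1 on {0..n} and
    X2 ~ (1 - q, q/n, ..., q/n)."""
    n = len(p1) - 1
    p2 = [1.0 - q] + [q / n] * n
    px = [0.0] * (n + 1)
    for x1, a in enumerate(p1):
        for x2, b in enumerate(p2):
            px[dmul(x2, x1, n)] += a * b
    return px
```

In this model the resulting distribution places mass $q(1 - \Pr(X_1 = 0))$ uniformly on the non-zero symbols, so the parameter of the broadcast distribution generally differs from the $q$ chosen for $X_2$ unless $X_1$ is never zero.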



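Let $\mathbf{p}_X = (1 - q, q\mathbf{p}_{\tilde{X}}^T)^T$ be the distribution of the channel input $X$, where $\mathbf{p}_{\tilde{X}}$ is the distribution of the sub-channel input $\tilde{X}$. For the discrete multiplicative DBC $X \to Y \to Z$,

$$\phi(\mathbf{p}_X, \lambda) = h_{n+1}(T_{ZX}\mathbf{p}_X) - \lambda h_{n+1}(T_{YX}\mathbf{p}_X) \tag{161}$$

$$= h_{n+1}\!\left(\begin{bmatrix} 1 - q + q\alpha_2 \\ q(1 - \alpha_2) T_{\tilde{Z}\tilde{X}}\mathbf{p}_{\tilde{X}} \end{bmatrix}\right) - \lambda h_{n+1}\!\left(\begin{bmatrix} 1 - q + q\alpha_1 \\ q(1 - \alpha_1) T_{\tilde{Y}\tilde{X}}\mathbf{p}_{\tilde{X}} \end{bmatrix}\right) \tag{162}$$

$$= h(q(1 - \alpha_2)) + q(1 - \alpha_2) h_n(T_{\tilde{Z}\tilde{X}}\mathbf{p}_{\tilde{X}}) - \lambda\big(h(q(1 - \alpha_1)) + q(1 - \alpha_1) h_n(T_{\tilde{Y}\tilde{X}}\mathbf{p}_{\tilde{X}})\big) \tag{163}$$

$$= h(q\beta_2) - \lambda h(q\beta_1) + q\beta_2\Big(h_n(T_{\tilde{Z}\tilde{X}}\mathbf{p}_{\tilde{X}}) - \frac{\lambda}{1 - \alpha_\Delta} h_n(T_{\tilde{Y}\tilde{X}}\mathbf{p}_{\tilde{X}})\Big), \tag{164}$$

where $\beta_1 = 1 - \alpha_1$ and $\beta_2 = 1 - \alpha_2$.

For the sub-channel $\tilde{X} \to \tilde{Y} \to \tilde{Z}$, define $\tilde{\phi}(\mathbf{p}_{\tilde{X}}, \frac{\lambda}{1 - \alpha_\Delta}) = h_n(T_{\tilde{Z}\tilde{X}}\mathbf{p}_{\tilde{X}}) - \frac{\lambda}{1 - \alpha_\Delta} h_n(T_{\tilde{Y}\tilde{X}}\mathbf{p}_{\tilde{X}})$. Define $\varphi(q, \mathbf{p}_{\tilde{X}}, \lambda)$ as follows:

$$\varphi(q, \mathbf{p}_{\tilde{X}}, \lambda) = h(q\beta_2) - \lambda h(q\beta_1) + q\beta_2\, \tilde{\psi}\Big(\mathbf{p}_{\tilde{X}}, \frac{\lambda}{1 - \alpha_\Delta}\Big), \tag{165}$$

where $\tilde{\psi}$ is the lower envelope of $\tilde{\phi}(\mathbf{p}_{\tilde{X}}, \frac{\lambda}{1 - \alpha_\Delta})$ in $\mathbf{p}_{\tilde{X}}$. With this definition, note that $\psi(\mathbf{p}_X, \lambda)$, the lower envelope of $\phi(\mathbf{p}_X, \lambda)$, is also the lower envelope of $\varphi(q, \mathbf{p}_{\tilde{X}}, \lambda)$.

Lemma 12: $\psi((1 - q, q\mathbf{u}^T)^T, \lambda)$, the lower envelope of $\phi(\mathbf{p}_X, \lambda)$ in $\mathbf{p}_X$ at $\mathbf{p}_X = (1 - q, q\mathbf{u}^T)^T$, is on the lower envelope of $\varphi(q, \mathbf{u}, \lambda)$ in $q$.

The proof is given in Appendix VII. Lemma 12 indicates that the computation of the lower envelope of $\phi(\cdot, \lambda)$ at $\mathbf{p}_X = (1 - q, q\mathbf{u}^T)^T$ can be decomposed into two steps. First, for any fixed $q$, the lower envelope of $\phi(\mathbf{p}_X, \lambda)$ in $\mathbf{p}_{\tilde{X}}$ is $\varphi(q, \mathbf{p}_{\tilde{X}}, \lambda)$. Second, for $\mathbf{p}_{\tilde{X}} = \mathbf{u}$, the lower envelope of $\varphi(q, \mathbf{u}, \lambda)$ in $q$ coincides with $\psi(\mathbf{p}_X, \lambda)$, the lower envelope of $\phi(\mathbf{p}_X, \lambda)$ in $\mathbf{p}_X$.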

Now we state and prove that the NE scheme is optimal for the discrete multiplicative 
DBC. 

Theorem 12: The NE scheme with time sharing achieves the boundary of the capacity 
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region for the discrete multiplicative degraded broadcast channel. 

Proof: Theorem 11 shows that the boundary of the capacity region for the discrete multiplicative
DBC can be achieved by using transmission strategies with uniformly distributed
$\tilde X$, i.e., the input distribution $p_X = (1-q, qu^T)^T$. For $p_X = (1-q, qu^T)^T$, by Lemma 12, $\psi((1-q, qu^T)^T, \lambda)$
can be attained by the convex combination of points on the graph of $\varphi(q, u, \lambda)$. Note that

$$\varphi(q, u, \lambda) = h(q\beta_2) - \lambda h(q\beta_1) + q\beta_2\, \psi\Big(u, \frac{\lambda}{1-\alpha_\Delta}\Big), \tag{166}$$

which is the sum of $\phi(q, \lambda)$ for the broadcast Z channel and $q$ times the constant $\beta_2\, \psi(u, \lambda/(1-\alpha_\Delta))$.
Hence, by a discussion analogous to Section V, $\psi((1-q, qu^T)^T, \lambda)$ can be attained by the
convex combination of 2 points on the graph of $\varphi(q, u, \lambda)$. One point is at $q = 0$, where
$\varphi(0, u, \lambda) = 0$. The other point is at $q = p_\lambda$, where $p_\lambda$ is determined by $\ln(1 - \beta_2 p_\lambda) = \lambda \ln(1 - \beta_1 p_\lambda)$.

Note that the point $(0, 0)$ on the graph of $\varphi(q, u, \lambda)$ is also on the graph of $\phi(p_X, \lambda)$. By
Theorem 6, the point $(p_\lambda, \varphi(p_\lambda, u, \lambda))$ is the convex combination of $n$ points on the graph of
$\phi(p_X, \lambda)$, which corresponds to the group-addition encoding approach for the sub-channel
$\tilde X \to \tilde Y \to \tilde Z$ because the group-addition encoding approach is the optimal NE scheme for
the group-additive DBC $\tilde X \to \tilde Y \to \tilde Z$. Therefore, by Theorem 6, an optimal transmission
strategy for the discrete multiplicative DBC $X \to Y \to Z$ has the NE structure as shown in
Fig.[Ii Q.E.D.

If the auxiliary random variable $U = 0$, then the channel input $X = 0$. If $U$ is a non-zero
symbol, then $X = 0$ with probability $1 - p_\lambda$. In the case where $U$ and $X$ are both non-zero, $X$
can be obtained as $X = U \oplus V$, where $\oplus$ is a group operation equivalent to group addition in
the group-additive degraded broadcast sub-channel $\tilde X \to \tilde Y \to \tilde Z$, $U$ is uniformly distributed
and $V$ is an $n$-ary random variable.

[Figure: The optimal transmission strategy for the discrete multiplicative degraded broadcast channel.]

Since the NE scheme is optimal for discrete multiplicative DBCs, its achievable rate region
is the capacity region for discrete multiplicative DBCs. Hence, the capacity region for the
discrete multiplicative DBC in Fig. 13 is

$$\mathrm{co} \bigcup \Big\{ (R_1, R_2) : R_2 \le H(U \otimes V \otimes N_2) - H(U \otimes V \otimes N_2 \mid U), \quad R_1 \le H(U \otimes V \otimes N_1 \mid U) - H(U \otimes V \otimes N_1 \mid U \otimes V) \Big\}. \tag{167}$$

VIII. Conclusions 

This paper explores relatively simple optimal encoders for the degraded broadcast channel.
By extending the input-symmetry and $F(\cdot)$ ideas of Wyner and Witsenhausen, this
paper introduced and proved the optimality of the permutation encoding approach for
input-symmetric DBCs and showed that natural encoding can achieve the boundary of the capacity
region for the multi-user broadcast Z channel, any two-user group-additive DBC, and (by
combining the previous two results) the two-user discrete multiplicative DBC. Along the
way, this paper has provided closed-form expressions for the capacity regions of these DBCs.
In conclusion, natural encoding achieves the capacity region of DBCs much more often than
has been previously known. In fact, it would seem that more cases in which
natural encoding achieves the DBC capacity region are waiting to be identified. It remains an
open problem to prove a general theorem establishing the optimality of natural encoding
over a suitably large class of DBCs. The results of this paper also open interesting problems
in channel coding: finding practical channel codes for the DBCs examined in this paper.
Appendices

Appendix I 

This appendix presents a simple independent encoding scheme, made known to us by Telatar
[5], which achieves the capacity region for DBCs. The scheme generalizes to any number of
receivers, but showing the two-user case suffices to explain the approach. It indicates that
any achievable rate pair $(R_1, R_2)$ for a DBC can be achieved by combining symbols from
independent encoders with a single-letter function. The independent encoders operate using
two codebooks $\{v^n(i) : i = 1, \cdots, 2^{nR_1}\}$ and $\{u^n(j) : j = 1, \cdots, 2^{nR_2}\}$, and a single-letter
function $f(v, u)$. In order to transmit the message pair $(i, j)$, the transmitter sends the
sequence $f(v_1(i), u_1(j)), \cdots, f(v_n(i), u_n(j))$. The scheme is described below:

Lemma 13: Suppose $U$ and $X$ are discrete random variables with joint distribution $p_{UX}$.
There exist $V$ independent of $U$ and a deterministic function $f$ such that the pair $(U, f(V, U))$
has joint distribution $p_{UX}$ [5].

Proof: Suppose $U$ and $X$ take values in $\{1, \cdots, l\}$ and $\{1, \cdots, k\}$ respectively. Let
$V = (V_1, \cdots, V_l)$, independent of $U$, be a random variable taking values in $\{1, \cdots, k\}^l$
with $\Pr(V_j = i) = p_{X|U}(i|j)$. Set $f((v_1, \cdots, v_l), u) = v_u$. Then we have

$$\Pr(U = u, f(V, U) = x) = \Pr(U = u, V_u = x) = \Pr(U = u)\Pr(V_u = x) = p_U(u)\, p_{X|U}(x|u) = p_{UX}(u, x). \tag{168}$$

Q.E.D.
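The construction in the proof of Lemma 13 can be checked exactly on a small example. The sketch below is illustrative only: the function names are hypothetical, symbols are indexed from 0 rather than 1, and the components of $V$ are drawn independently of each other, which is one valid choice (the proof only requires the marginals $\Pr(V_u = x) = p_{X|U}(x|u)$).

```python
import itertools
from fractions import Fraction

def functional_representation(p_x_given_u, k):
    """Build (p_V, f) as in the proof of Lemma 13.

    p_x_given_u[u][x] = p_{X|U}(x|u) for u in range(l), x in range(k).
    V = (V_0, ..., V_{l-1}) takes values in range(k)**l.  Assumption: the
    components are mutually independent, so p_V is a product distribution.
    """
    l = len(p_x_given_u)
    p_v = {}
    for v in itertools.product(range(k), repeat=l):
        prob = Fraction(1)
        for u in range(l):
            prob *= p_x_given_u[u][v[u]]
        p_v[v] = prob

    def f(v, u):
        # The single-letter function simply selects the u-th component of v.
        return v[u]

    return p_v, f

def induced_joint(p_u, p_v, f, k):
    """Joint distribution of (U, f(V, U)) when U and V are independent."""
    joint = [[Fraction(0)] * k for _ in range(len(p_u))]
    for u, pu in enumerate(p_u):
        for v, pv in p_v.items():
            joint[u][f(v, u)] += pu * pv
    return joint

# Hypothetical example with l = 2 values of U and k = 3 values of X.
p_u = [Fraction(1, 3), Fraction(2, 3)]
p_x_given_u = [
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(0), Fraction(1, 5), Fraction(4, 5)],
]
p_v, f = functional_representation(p_x_given_u, k=3)
joint = induced_joint(p_u, p_v, f, k=3)
```

Because exact rational arithmetic is used, `joint[u][x]` can be compared for equality with `p_u[u] * p_x_given_u[u][x]`, which is the content of (168).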

If the rate pair $(R_1, R_2)$ is achievable for a degraded broadcast channel $X \to Y \to Z$,
there exists an auxiliary random variable $U$ such that

(a) $U \to X \to Z$;

(b) $I(X; Y|U) > R_1$;

(c) $I(U; Z) > R_2$. (169)

Use Lemma 13 to find $V$ independent of $U$ and the deterministic function $f(v, u)$ such that
the pair $(U, f(V, U))$ has the same joint distribution as that of $(U, X)$. Randomly and independently
choose codewords $\{v^n(1), \cdots, v^n(2^{nR_1})\}$ according to $p(v^n) = p_V(v_1) \cdots p_V(v_n)$,
and choose codewords $\{u^n(1), \cdots, u^n(2^{nR_2})\}$ according to $p(u^n) = p_U(u_1) \cdots p_U(u_n)$. To
send message pair $(i, j)$, the encoder transmits $f(v_1(i), u_1(j)), \cdots, f(v_n(i), u_n(j))$.
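The encoder just described can be sketched in a few lines. This is a toy illustration under stated assumptions, not a practical code: the block length, rates, alphabets and distributions are hypothetical, messages are indexed from 0, and $f(v, u) = v_u$ is taken from the proof of Lemma 13.

```python
import math
import random

def draw_codebook(rate, n, alphabet, weights, rng):
    """Draw ceil(2^(n*rate)) codewords of length n with i.i.d. symbols."""
    size = math.ceil(2 ** (n * rate))
    return [rng.choices(alphabet, weights=weights, k=n) for _ in range(size)]

def encode(i, j, v_book, u_book, f):
    """Return f(v_1(i), u_1(j)), ..., f(v_n(i), u_n(j))."""
    return [f(v, u) for v, u in zip(v_book[i], u_book[j])]

def select_component(v, u):
    # Single-letter function from Lemma 13: pick the u-th component of v.
    return v[u]

rng = random.Random(0)          # fixed seed so the sketch is reproducible
n = 8                           # hypothetical block length
u_alphabet = [0, 1]
v_alphabet = [(a, b) for a in range(3) for b in range(3)]  # values of (V_0, V_1)
v_book = draw_codebook(0.5, n, v_alphabet, [1] * len(v_alphabet), rng)
u_book = draw_codebook(0.25, n, u_alphabet, [1, 2], rng)
x = encode(3, 1, v_book, u_book, select_component)
```

The two codebooks are generated independently, and the only coupling between the two messages is the symbol-by-symbol application of `select_component`, which is the independent-encoding structure the appendix describes.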

Using a typical-set-decoding random-coding argument, the weak decoder, given $z^n$, searches
for the unique $j'$ such that $(z^n, u^n(j'))$ is jointly typical. The error probability converges to
zero as $n$ goes to infinity since $R_2 < I(U; Z)$. The strong decoder, given $y^n$, also searches
for the unique $j'$ such that $(y^n, u^n(j'))$ is jointly typical, and then searches for the unique $i'$
such that $(y^n, v^n(i'))$ is jointly typical given $u^n(j')$. The error probability converges to zero
as $n$ goes to infinity since

$$R_2 < I(U; Z) \le I(U; Y), \tag{170}$$

and

$$R_1 < I(X; Y|U) = H(Y|U) - H(Y|f(V, U), U) \le H(Y|U) - H(Y|f(V, U), U, V) = H(Y|U) - H(Y|U, V) = I(V; Y|U). \tag{171}$$



Appendix II 

Proof of Lemma 3: Part i) is a consequence of Lemma 1 by applying the Fenchel-Eggleston
strengthening of Caratheodory's Theorem pi] (Theorem 18(i)(ii), p. 35). If a compact set
$S$ has $d$ or fewer connected components, and the set $C$ with dimension $d$ is the convex hull of
$S$, then the Fenchel-Eggleston Theorem shows that every point in $C$ is a convex combination
of $d$ or fewer points of $S$. Since the dimension of $C$ in this paper is $d = k + 1$, every point of $C$
can be achieved by ([2]) ([10]) and (HI]) with $l \le k + 1$.

ii) Dubins' Theorem [19] (Theorem 3.6.20, p. 116) shows that if a set $C$ is convex and compact,
then every extreme point of the intersection of $C$ with $d$ hyperplanes is a convex combination
of $d + 1$ or fewer extreme points of $C$. A two-dimensional plane in $(k + 1)$ dimensions can
be considered as the intersection of $(k + 1) - 2 = k - 1$ hyperplanes. Thus every extreme
point of the intersection of $C$ with a two-dimensional plane is a convex combination of $l \le k$
extreme points of $C$. Part ii) is then proved by the fact that every extreme point of $C$ belongs
to $S$. Q.E.D.
Appendix III

Proof of Theorem\^ i) The point $(q, \psi(q, \lambda))$ is a point on the lower boundary of $C_\lambda$
which is obtained as a convex combination of the points $(p_j, \phi(p_j, \lambda))$ of $S_\lambda$ with weights
$w_j$. By (15), the transmission strategy $U \to X$ determined by $|U| = l$, $\Pr(U = j) = w_j$
and $p_{X|U=j} = p_j$, $j = 1, \cdots, l$, achieves the maximum of $R_2 + \lambda R_1$ subject to the constraint
$p_X = q$. Thus, the point $(q, \sum_j w_j h_n(T_{YX} p_j), \sum_j w_j h_m(T_{ZX} p_j))$ is on the lower boundary
of $C$, and hence of $C^*$. It implies that the graph of $F^*(q, \cdot)$ is supported by a line of slope $\lambda$ at
that point, and thus (46) holds. For Part ii), if the transmission strategy $U \to X$ determined
by $|U| = l$, $\Pr(U = j) = w_j$ and $p_{X|U=j} = p_j$, $j = 1, \cdots, l$, achieves the maximum of $R_2 + \lambda R_1$
subject to the constraint $p_X = q$, the point $(q, \sum_j w_j h_n(T_{YX} p_j), \sum_j w_j h_m(T_{ZX} p_j))$ is on the
lower boundary of $C^*$, and at this point the graph of $F^*(q, \cdot)$ is supported by a line of slope
$\lambda$. Thus, the point $(q, \psi(q, \lambda))$ is the convex combination of $l$ points of the graph of $\phi(\cdot, \lambda)$
with arguments $p_j$ and weights $w_j$, $j = 1, \cdots, l$. Q.E.D.

Appendix IV 

Proof of Lemma 5: Lemma 5 is the consequence of Theorem 5 for the broadcast Z channel.
Since $H(\mathbf{Y}|U) \ge N \cdot \frac{q}{p} \cdot h(\beta_1 p)$,

$$H(\mathbf{Z}|U) \ge F^*_{T^{(N)}_{YX},\, T^{(N)}_{ZX}}\Big(q,\; N \cdot \frac{q}{p} \cdot h(\beta_1 p)\Big) \tag{172}$$

$$= N \cdot F^*_{T_{YX},\, T_{ZX}}\Big(q,\; \frac{q}{p} \cdot h(\beta_1 p)\Big) \tag{173}$$

$$= N \cdot \frac{q}{p} \cdot h(\beta_2 p) \tag{174}$$

$$= N \cdot \frac{q}{p} \cdot h(\beta_1 p \beta_\Delta). \tag{175}$$

These steps are justified as follows:

• (172) follows from the definition of $F^*$;

• (173) follows from Theorem 5;

• (174) follows from the expression of the function $F^*$ for the broadcast Z channel in (59);

• (175) follows from $\beta_\Delta = \Pr\{Z = 0 \mid Y = 0\} = \beta_2/\beta_1$.



Appendix V 

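Proof of [7^: Plugging $j = 1$ in (76), we have

$$H(\mathbf{Y}_1|W_2, \cdots, W_K) - H(\mathbf{Y}_1|W_1, \cdots, W_K) \ge N \frac{q}{t_1} h(\beta_1 t_1) - N q\, h(\beta_1) - o(\epsilon) \tag{176}$$

or

$$H(\mathbf{Y}_1|W_2, \cdots, W_K) \ge N \frac{q}{t_1} h(\beta_1 t_1) - o(\epsilon), \tag{177}$$

since

$$H(\mathbf{Y}_1|W_1, \cdots, W_K) = H(\mathbf{Y}_1|\mathbf{X}) \tag{178}$$

$$= \sum_{i=1}^{N} H(Y_{1,i}|\mathbf{X}) \tag{179}$$

$$= \sum_{i=1}^{N} H(Y_{1,i}|X_i) \tag{180}$$

$$= \sum_{i=1}^{N} \Pr(X_i = 0)\, h(\beta_1) \tag{181}$$

$$= N q\, h(\beta_1). \tag{182}$$

Some of these steps are justified as follows:

• (178) follows from the fact that $\mathbf{X}$ is a function of $(W_1, \cdots, W_K)$;

• (179) follows from the conditional independence of $Y_{1,i}$, $i = 1, \cdots, N$, given $\mathbf{X}$;

• (180) follows from the conditional independence of $Y_{1,i}$ and $(X_1, \cdots, X_{i-1}, X_{i+1}, \cdots, X_N)$
given $X_i$.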
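Inequality (177) indicates that

$$H(\mathbf{Y}_j|W_{j+1}, \cdots, W_K) \ge N \frac{q}{t_j} h(\beta_j t_j) - o(\epsilon) \tag{183}$$

is true for $j = 1$. The rest of the proof is by induction. We assume that (183) is true for $j$,
which means

$$H(\mathbf{Y}_j|W_{j+1}, \cdots, W_K) \ge N \Big( \frac{q}{t_j} h(\beta_j t_j) - \frac{o(\epsilon)}{N} \Big) \tag{184}$$

$$= N \frac{q}{t_j + \tau(\epsilon)}\, h(\beta_j (t_j + \tau(\epsilon))), \tag{185}$$

where the function $\tau(\epsilon) \to 0$ as $\epsilon \to 0$, since $\frac{q}{t_j} h(\beta_j t_j)$ is continuous in $t_j$. Applying Lemma
5 to the Markov chain $(W_{j+1}, \cdots, W_K) \to \mathbf{X} \to \mathbf{Y}_j \to \mathbf{Y}_{j+1}$, we have

$$H(\mathbf{Y}_{j+1}|W_{j+1}, \cdots, W_K) \ge N \frac{q}{t_j + \tau(\epsilon)}\, h(\beta_{j+1} (t_j + \tau(\epsilon))) \tag{186}$$

$$= N \frac{q}{t_j} h(\beta_{j+1} t_j) + o(\epsilon). \tag{187}$$

Considering (76) for $j + 1$, we have

$$H(\mathbf{Y}_{j+1}|W_{j+2}, \cdots, W_K) - H(\mathbf{Y}_{j+1}|W_{j+1}, \cdots, W_K) \ge N \frac{q}{t_{j+1}} h(\beta_{j+1} t_{j+1}) - N \frac{q}{t_j} h(\beta_{j+1} t_j) - o(\epsilon). \tag{188}$$

Substitution of (187) in (188) yields

$$H(\mathbf{Y}_{j+1}|W_{j+2}, \cdots, W_K) \ge N \frac{q}{t_{j+1}} h(\beta_{j+1} t_{j+1}) - o(\epsilon), \tag{189}$$

which establishes the induction. Finally, for $j \ge d$, $N\delta$ should be added to the right side of
(184) because of the presence of $\delta$ in fl^ for $j = d$, and hence, of $N\delta$ in (17^ .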
Appendix VI

Proof of Lemma 11: Let $\mathcal{G}_{T_{YX}, T_{ZX}} = \{G_1, \cdots, G_l\}$. For any $(\xi, \eta) \in C^*_{p_X}$, where $p_X = (1-q, q p_{\tilde X}^T)^T$,
one has $(p_X, \xi, \eta) \in C$. Since Lemma [HI and Corollary 1 also hold for the
discrete multiplicative DBC, $(G_j p_X, \xi, \eta) \in C$ for all $j = 1, \cdots, l$. By the convexity of the
set $C$,

$$(\boldsymbol{q}, \xi, \eta) = \Big( \sum_{j=1}^{l} \frac{1}{l} G_j p_X,\; \xi,\; \eta \Big) \in C, \tag{190}$$

where $\boldsymbol{q} = \sum_{j=1}^{l} \frac{1}{l} G_j p_X$. Since $\mathcal{G}_{T_{YX}, T_{ZX}}$ is a group, for any permutation matrix $G' \in \mathcal{G}_{T_{YX}, T_{ZX}}$,

$$G' \boldsymbol{q} = \sum_{j=1}^{l} \frac{1}{l} G' G_j p_X = \sum_{j=1}^{l} \frac{1}{l} G_j p_X = \boldsymbol{q}. \tag{191}$$

Hence, the $i$-th entry and the $j$-th entry of $\boldsymbol{q}$ are the same if $G'$ permutes the $i$-th row to the
$j$-th row. Since, for any discrete multiplicative DBC, the set $\mathcal{G}_{T_{YX}, T_{ZX}}$ maps any non-zero
element in $\{0, 1, \cdots, n\}$ to any other non-zero element, all entries except the first entry of $\boldsymbol{q}$
are the same as each other. Furthermore, no matrix in $\mathcal{G}_{T_{YX}, T_{ZX}}$ maps the zero element to a
non-zero element, hence the first entry of $\boldsymbol{q}$ is the same as the first entry of $p_X$. Therefore,
$\boldsymbol{q} = (1-q, qu^T)^T$. This implies that $(\xi, \eta) \in C^*_{(1-q, qu^T)^T}$, and hence $C^*_{p_X} \subseteq C^*_{(1-q, qu^T)^T}$.
Therefore, $C^* = \bigcup_{q \in [0,1]} C^*_{(1-q, qu^T)^T}$. Q.E.D.
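The averaging step in (190) and (191) can be illustrated numerically. The sketch below is an assumption-laden toy: the cyclic shifts of the non-zero symbols stand in for the group $\mathcal{G}_{T_{YX}, T_{ZX}}$ (any group that fixes 0 and acts transitively on the non-zero symbols would behave the same way), and all function names are hypothetical.

```python
from fractions import Fraction

def cyclic_shifts_fixing_zero(n):
    """Permutations of {0, ..., n} that fix 0 and cyclically shift 1..n.
    perm[i] is the symbol that symbol i is mapped to."""
    return [[0] + [1 + (i + s) % n for i in range(n)] for s in range(n)]

def apply_permutation(perm, p):
    """Distribution of perm(X) when X has distribution p."""
    out = [Fraction(0)] * len(p)
    for symbol, mass in enumerate(p):
        out[perm[symbol]] += mass
    return out

def group_average(perms, p):
    """Uniform average of G_j p over the group, as in (190)."""
    total = [Fraction(0)] * len(p)
    for perm in perms:
        for symbol, mass in enumerate(apply_permutation(perm, p)):
            total[symbol] += mass
    return [mass / len(perms) for mass in total]

# Hypothetical input distribution with n = 3 and q = 2/5.
p_x = [Fraction(3, 5), Fraction(1, 5), Fraction(1, 10), Fraction(1, 10)]
group = cyclic_shifts_fixing_zero(3)
q_bar = group_average(group, p_x)
```

Here `q_bar` keeps the mass of symbol 0 and spreads the remaining mass evenly over the non-zero symbols, so it has the form $(1-q, qu^T)^T$, and applying any element of `group` to `q_bar` leaves it unchanged, which mirrors (191).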

Appendix VII 

Proof of Lemma 12: $\psi(p_X, \lambda)$ is the lower envelope of $\varphi(q, p_{\tilde X}, \lambda)$ in $p_X$. For $p_X = (1-q, qu^T)^T$,
suppose the point $(p_X, \psi(p_X, \lambda))$ is the convex combination of $n + 1$ points
$((q_i, p_i), \varphi(q_i, p_i, \lambda))$ on the graph of $\varphi(q, p_{\tilde X}, \lambda)$ with weights $w_i$ for $i = 1, \cdots, n + 1$. Therefore,

$$q = \sum_{i=1}^{n+1} w_i q_i, \tag{192}$$

$$u = \sum_{i=1}^{n+1} w_i p_i, \tag{193}$$

$$\psi(p_X, \lambda) = \sum_{i=1}^{n+1} w_i\, \varphi(q_i, p_i, \lambda). \tag{194}$$

Since $\psi(p, \lambda) \ge \psi(u, \lambda)$ for the group-additive degraded broadcast sub-channel,

$$\varphi(q_i, p_i, \lambda) \ge \varphi(q_i, u, \lambda). \tag{195}$$

Therefore, the convex combination of $n + 1$ points $((q_i, u), \varphi(q_i, u, \lambda))$ with weights $w_i$ has

$$\sum_{i=1}^{n+1} w_i q_i = q, \tag{196}$$

and

$$\sum_{i=1}^{n+1} w_i\, \varphi(q_i, u, \lambda) \le \sum_{i=1}^{n+1} w_i\, \varphi(q_i, p_i, \lambda) = \psi(p_X, \lambda). \tag{197}$$

On the other hand, since $\psi(p_X, \lambda)$ is the lower envelope of $\varphi(q, p_{\tilde X}, \lambda)$ in $p_X$, $\sum_{i=1}^{n+1} w_i\, \varphi(q_i, u, \lambda) \ge \psi(p_X, \lambda)$,
and hence $\sum_{i=1}^{n+1} w_i\, \varphi(q_i, u, \lambda) = \psi(p_X, \lambda)$. Therefore, $\psi((1-q, qu^T)^T, \lambda)$, the lower
envelope of $\phi(p_X, \lambda)$ in $p_X$ at $p_X = (1-q, qu^T)^T$, can be attained as the convex combination
of points on the graph of $\varphi(q, u, \lambda)$ in the dimension of $q$. Q.E.D.